Inventory Monitoring at Distribution Centers¶

Udacity AWS Machine Learning Engineer - Capstone Project¶

This is a deep learning project built as the capstone of Udacity's AWS Machine Learning Engineer Nanodegree program.

Distribution centers often use robots to move objects as part of their operations. Objects are carried in bins, which can contain multiple objects. Occasionally, items are misplaced while being handled, so the contents of some bin images may not match the recorded inventory of that bin.

This project builds a model that can count the number of objects in each bin. Such a system can be used to track inventory and verify that delivery consignments contain the correct number of items.

The solution uses AWS SageMaker and sound machine-learning engineering practices to fetch data from the Amazon Bin Image Dataset, preprocess it, and fine-tune a pre-trained model that classifies each image by the number of objects in the bin.

In [ ]:
# Install packages
import sys
!{sys.executable} -m pip install smdebug torch torchvision tqdm ipywidgets bokeh
In [ ]:
! apt-get update && apt-get install ffmpeg libsm6 libxext6  -y
In [ ]:
! pip install easydev colormap colorgram.py extcolors
In [6]:
# Importing packages
%matplotlib inline

import os
import json
import boto3
import sagemaker

import torch
import torch.nn as nn
import torch.nn.functional as F
import IPython

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import random
import cv2

from PIL import Image
from tqdm import tqdm
from sagemaker.tuner import CategoricalParameter, ContinuousParameter, HyperparameterTuner, IntegerParameter
from sagemaker.pytorch import PyTorch
from sagemaker.debugger import Rule, DebuggerHookConfig, TensorBoardOutputConfig, CollectionConfig, ProfilerRule, rule_configs, ProfilerConfig, FrameworkProfile
from sagemaker.analytics import HyperparameterTuningJobAnalytics
from sagemaker.pytorch import PyTorchModel
from sagemaker.predictor import Predictor
from sagemaker import get_execution_role
from sklearn.model_selection import train_test_split

Data Preparation¶

Running the cell below downloads the data.

It creates a dataset folder, downloads the training, validation, and testing data, and arranges it in subfolders. Each of these subfolders contains images whose object count equals the folder name. For instance, every image in folder 1 contains exactly 1 object.

In [11]:
def download_images(files_list, data_path):
    s3_client = boto3.client('s3')
    data_path = os.path.join('dataset', data_path)

    for k, v in files_list.items():
        print(f"Downloading Images with {k} objects to {data_path}")
        directory=os.path.join(data_path, k)
        if not os.path.exists(directory):
            os.makedirs(directory)
        for file_path in tqdm(v):
            file_name=os.path.basename(file_path).split('.')[0]+'.jpg'
            s3_client.download_file('aft-vbi-pds', os.path.join('bin-images', file_name),
                             os.path.join(directory, file_name))

def download_and_arrange_data():
    with open('misc/file_list.json', 'r') as f:
        d=json.load(f)

    # splitting data into 65% for training, 21% for validation, and 14% for testing
    train = {}
    test = {}
    validation = {}
    for k, v in d.items():
        train[k], test[k] = train_test_split(d[k], test_size=0.35, random_state=0)
        test[k], validation[k] = train_test_split(test[k], test_size=0.60, random_state=0)

    download_images(train, 'train')
    download_images(test, 'test')
    download_images(validation, 'valid')

download_and_arrange_data()
  0%|          | 2/798 [00:00<00:59, 13.27it/s]
Downloading Images with 1 objects to dataset/train
100%|██████████| 798/798 [01:19<00:00, 10.00it/s]
  0%|          | 1/1494 [00:00<03:12,  7.76it/s]
Downloading Images with 2 objects to dataset/train
100%|██████████| 1494/1494 [02:33<00:00,  9.73it/s]
  0%|          | 2/1732 [00:00<02:26, 11.81it/s]
Downloading Images with 3 objects to dataset/train
100%|██████████| 1732/1732 [02:59<00:00,  9.66it/s]
  0%|          | 1/1542 [00:00<03:01,  8.51it/s]
Downloading Images with 4 objects to dataset/train
100%|██████████| 1542/1542 [02:34<00:00,  9.95it/s]
  0%|          | 1/1218 [00:00<02:29,  8.16it/s]
Downloading Images with 5 objects to dataset/train
100%|██████████| 1218/1218 [02:04<00:00,  9.77it/s]
  1%|          | 1/172 [00:00<00:21,  7.78it/s]
Downloading Images with 1 objects to dataset/test
100%|██████████| 172/172 [00:16<00:00, 10.28it/s]
  1%|          | 2/322 [00:00<00:30, 10.66it/s]
Downloading Images with 2 objects to dataset/test
100%|██████████| 322/322 [00:33<00:00,  9.61it/s]
  1%|          | 2/373 [00:00<00:33, 10.96it/s]
Downloading Images with 3 objects to dataset/test
100%|██████████| 373/373 [00:38<00:00,  9.80it/s]
  0%|          | 1/332 [00:00<00:49,  6.69it/s]
Downloading Images with 4 objects to dataset/test
100%|██████████| 332/332 [00:33<00:00,  9.78it/s]
  0%|          | 1/262 [00:00<00:39,  6.64it/s]
Downloading Images with 5 objects to dataset/test
100%|██████████| 262/262 [00:26<00:00, 10.03it/s]
  0%|          | 1/258 [00:00<00:39,  6.54it/s]
Downloading Images with 1 objects to dataset/valid
100%|██████████| 258/258 [00:25<00:00,  9.93it/s]
  0%|          | 1/483 [00:00<00:57,  8.41it/s]
Downloading Images with 2 objects to dataset/valid
100%|██████████| 483/483 [00:50<00:00,  9.57it/s]
  0%|          | 2/561 [00:00<00:45, 12.41it/s]
Downloading Images with 3 objects to dataset/valid
100%|██████████| 561/561 [00:57<00:00,  9.74it/s]
  0%|          | 2/499 [00:00<00:38, 12.78it/s]
Downloading Images with 4 objects to dataset/valid
100%|██████████| 499/499 [00:51<00:00,  9.75it/s]
  0%|          | 1/395 [00:00<00:48,  8.15it/s]
Downloading Images with 5 objects to dataset/valid
100%|██████████| 395/395 [00:39<00:00,  9.93it/s]

Dataset¶

The full Amazon Bin Image Dataset is very large, around 500,000 images. In this project we use only a small subset of it, about 10,441 images split between training, validation, and testing, so that we can evaluate model performance before launching a big training job on the whole dataset. This approach reduces cost and time when developing new machine learning models.

In [12]:
# Build an index of the downloaded images: one row per file, with its
# object-count label (category) and its dataset split (type).
# Note: DataFrame.append was removed in pandas 2.0, so we collect rows
# in a list and build the DataFrame once at the end.
data_root = 'dataset'

records = []
for split in os.listdir(data_root):
    for category in os.listdir(os.path.join(data_root, split)):
        for file in os.listdir(os.path.join(data_root, split, category)):
            if file.endswith('.jpg'):
                records.append({'category': category,
                                'image_name': os.path.join(data_root, split, category, file),
                                'type': split})

bin_images = pd.DataFrame(records)
In [13]:
bin_images.head()
Out[13]:
category image_name type
0 4 dataset/test/4/102257.jpg test
1 4 dataset/test/4/103706.jpg test
2 4 dataset/test/4/05561.jpg test
3 4 dataset/test/4/01169.jpg test
4 4 dataset/test/4/105128.jpg test
In [14]:
bin_images.describe()
Out[14]:
category image_name type
count 10441 10441 10441
unique 5 10441 3
top 3 dataset/train/2/06072.jpg train
freq 2666 1 6784
In [15]:
bin_images.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10441 entries, 0 to 10440
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   category    10441 non-null  object
 1   image_name  10441 non-null  object
 2   type        10441 non-null  object
dtypes: object(3)
memory usage: 244.8+ KB
In [16]:
type_plot = bin_images['type'].value_counts().plot.bar()
plt.title('Dataset split')
plt.xlabel('Dataset type')
plt.ylabel('Number of Images')
plt.savefig('results/dataset_split_percentage.png')
plt.figure()
category_plot = bin_images['category'].value_counts().plot.bar()
plt.title('Data distribution between target features')
plt.xlabel('Target bin count')
plt.ylabel('Number of Images')
plt.savefig('results/data_distrution_between_target_features.png')
In [17]:
fig = plt.figure(figsize=(30, 10))
rows, columns = 3, 6

sampled_list = random.sample(bin_images.values.tolist(), 18)

for index, sample in enumerate(sampled_list):
    fig.add_subplot(rows, columns, index+1)
    #print(sample[0])
    plt.imshow(Image.open(sample[1]))
    plt.axis('off')
    plt.title(sample[0], fontsize=20)
    
# use the figure-level title so the last subplot's label is not overwritten
fig.suptitle('Images with their target count', fontsize=24)
plt.savefig('results/images_with_their_target_count.png')
In [26]:
from utils import exact_color
exact_color(sampled_list[4][1], 900, 12, 2.5)
In [27]:
exact_color(sampled_list[8][1], 900, 12, 2.5)
In [28]:
# load color image 
bgr_img = cv2.imread(sampled_list[1][1])
# convert to grayscale
gray_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2GRAY)

# normalize, rescale entries to lie in [0,1]
gray_img = gray_img.astype("float32")/255
plt.figure(figsize=(15, 10))
# plot image
plt.imshow(gray_img, cmap='gray')
plt.savefig('results/gray_img.png')
plt.show()
In [29]:
# Defining four different filters, 
# all of which are linear combinations of the `filter_vals` defined below

filter_vals = np.array([[-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1], [-1, -1, 1, 1]])
filter_1 = filter_vals
filter_2 = -filter_1
filter_3 = filter_1.T
filter_4 = -filter_3
filters = np.array([filter_1, filter_2, filter_3, filter_4])
# visualize all four filters
fig = plt.figure(figsize=(10, 5))
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))
    width, height = filters[i].shape
    for x in range(width):
        for y in range(height):
            ax.annotate(str(filters[i][x][y]), xy=(y,x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if filters[i][x][y]<0 else 'black')
In [31]:
from utils import Net, viz_layer

# instantiate the model and set the weights
weight = torch.from_numpy(filters).unsqueeze(1).type(torch.FloatTensor)
model = Net(weight)
plt.figure(figsize=(10, 6))
plt.imshow(gray_img, cmap='gray')

fig = plt.figure(figsize=(15, 10))
# visualize all filters
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(4):
    ax = fig.add_subplot(1, 4, i+1, xticks=[], yticks=[])
    ax.imshow(filters[i], cmap='gray')
    ax.set_title('Filter %s' % str(i+1))

    
# convert the image into an input Tensor
gray_img_tensor = torch.from_numpy(gray_img).unsqueeze(0).unsqueeze(1)

# get the convolutional layer (pre and post activation)
conv_layer, activated_layer = model(gray_img_tensor)

# visualize the output of a conv layer
viz_layer(conv_layer)
[2023-04-04 21:19:13.119 pytorch-1-6-cpu-py36--ml-t3-medium-370ee60fbc7a856e8f67ac271515:33 INFO utils.py:27] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2023-04-04 21:19:13.231 pytorch-1-6-cpu-py36--ml-t3-medium-370ee60fbc7a856e8f67ac271515:33 INFO profiler_config_parser.py:102] Unable to find config at /opt/ml/input/config/profilerconfig.json. Profiler is disabled.
In [32]:
# after a ReLu is applied
# visualize the output of an activated conv layer
viz_layer(activated_layer)

Data Insights¶

1- The charts above show that image counts are not uniform across categories, which could bias our model: during training it will see more images of some object counts than of others. For example, the "3" category has roughly double the images of the "1" category.

2- The data is split roughly 65% training, 21% validation, and 14% testing, which I think is a good balance for our model.

3- Looking at random images from the dataset, the item counts are somewhat difficult to distinguish because of:

  • the image colors and saturation
  • the packaging tape, which covers some of the details
  • items wrapped in the same packaging paper
  • the position and angle of the items (when too close to each other, several items can look like one)

4- Some data cleaning and preparation is required. Above we used a method for extracting the dominant color from an image, and noticed that the most dominant color is the color of the packaging. This will not help our model train better, since we want to pass only useful information to it. (This method was taken from this post: https://towardsdatascience.com/image-color-extraction-with-python-in-4-steps-8d9370d9216e)

5- Visualizing our images after they pass through some CNN filters, we see that some filters make the packaging tape dominate over the items, which would make training worse, while other filters suppress the packaging tape and keep the useful information. (This method was taken from Udacity course materials: https://github.com/udacity/machine-learning/blob/master/projects/practice_projects/cnn/conv-visualization/conv_visualization.ipynb)

Finally, since this dataset can be fed to our model as-is, I've decided to leave it raw and upload it directly to an AWS S3 bucket.
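One common mitigation for the class imbalance noted in point 1 is to weight the loss by inverse class frequency. A minimal sketch using the training-set image counts downloaded above (illustrative; the resulting weights could, for example, be passed as a tensor to `nn.CrossEntropyLoss(weight=...)`):

```python
# Training-set image counts per object-count category (from the download above).
counts = {'1': 798, '2': 1494, '3': 1732, '4': 1542, '5': 1218}
total = sum(counts.values())

# Inverse-frequency weights: rarer classes get weights above 1,
# so their errors cost more during training.
weights = {k: total / (len(counts) * n) for k, n in counts.items()}
```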

Uploading data to AWS S3¶

In [ ]:
# Upload the data to AWS S3
!aws s3 cp 'dataset' s3://udacity-capstone-project-2023/ --recursive
In [27]:
!aws s3 cp 'models' s3://udacity-capstone-project-2023/models --recursive
upload: models/resnet34_best.pth.tar to s3://udacity-capstone-project-2023/models/resnet34_best.pth.tar

Hyperparameter Tuning¶

First, we will start by finding the best hyperparameters by launching a hyperparameter tuning job with our pretrained model. The choice of hyperparameter ranges is somewhat arbitrary; the two most important parameters are the learning rate and the batch size.

  • The learning rate: very important for the speed of the learning process. Too small a learning rate makes training slow and can leave the model stuck in poor minima, while too large a one can make training diverge or produce non-optimal results.

  • The batch size: also very important, as it controls the accuracy of the estimate of the error gradient when training neural networks.

Here we will use the hpo.py script to do the hyperparameter tuning, on an ml.g4dn.xlarge instance to speed up the work, since we have only one available.
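To make the learning-rate point concrete, here is a toy illustration (not part of the training script): one gradient-descent step on f(w) = (w - 3)², whose minimum is at w = 3.

```python
def gd_step(w, lr):
    """One gradient-descent step on f(w) = (w - 3)**2."""
    grad = 2 * (w - 3)  # analytic gradient of f
    return w - lr * grad

w_small = gd_step(0.0, 0.01)  # 0.06: a small, safe step toward w = 3
w_large = gd_step(0.0, 1.5)   # 9.0: overshoots the minimum entirely
```

A tuner searches for a rate between these extremes: large enough to make progress, small enough not to overshoot.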

In [15]:
role = sagemaker.get_execution_role()
In [16]:
# Declare Hyperparameter ranges, metrics etc.
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(0.001, 0.1),
    "batch_size": CategoricalParameter([32, 64, 128, 256, 512]),
    "epochs": IntegerParameter(10,40)
}

objective_metric_name = "Average Test loss"
objective_type = "Minimize"
#metric_definitions = [{"Name": "Test Loss", "Regex": "Testing Loss: ([0-9\\.]+)"}]
metric_definitions = [{"Name": "Average Test loss", "Regex": "Average Test loss: ([0-9\\.]+)"}]
In [42]:
# Create the HyperparameterTuner estimator
estimator = PyTorch(
    entry_point="code/hpo.py",
    base_job_name='inventory-monitoring-hpo',  # SageMaker job names allow hyphens, not underscores
    role=role,
    framework_version="1.8",
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    py_version='py36',
    output_path = "s3://udacity-capstone-project-2023/hpo-output/"
)

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name,
    hyperparameter_ranges,
    metric_definitions,
    max_jobs=2,
    max_parallel_jobs=1,  
    objective_type=objective_type
)
In [63]:
os.environ['SM_CHANNEL_TRAINING']='s3://udacity-capstone-project-2023/'
os.environ['SM_MODEL_DIR']='s3://udacity-capstone-project-2023/models/'
os.environ['SM_OUTPUT_DATA_DIR']='s3://udacity-capstone-project-2023/output/'
In [22]:
# Fit the estimator
tuner.fit({"training": "s3://udacity-capstone-project-2023/"})
..............................................................................................................................................................................!

Model Training with the best hyperparameters¶

In [23]:
# Find the best hyperparameters
best_estimator = tuner.best_estimator()

#Get the hyperparameters of the best trained model
best_hyperparameters = best_estimator.hyperparameters()
hyperparameters = {
    "batch_size": int(best_hyperparameters['batch_size'].strip('"')),
    "learning_rate": best_hyperparameters['learning_rate'],
    "epochs": best_hyperparameters['epochs'],
}

hyperparameters
2023-04-02 17:02:39 Starting - Preparing the instances for training
2023-04-02 17:02:39 Downloading - Downloading input data
2023-04-02 17:02:39 Training - Training image download completed. Training in progress.
2023-04-02 17:02:39 Uploading - Uploading generated training model
2023-04-02 17:02:39 Completed - Resource reused by training job: pytorch-training-230402-1656-002-2daf9942
Out[23]:
{'batch_size': 128, 'learning_rate': '0.06246976097402943', 'epochs': '11'}
In [21]:
hyperparameters = {'batch_size': 128, 'learning_rate': '0.06246976097402943', 'epochs': '25'}

hyperparameters
Out[21]:
{'batch_size': 128, 'learning_rate': '0.06246976097402943', 'epochs': '25'}
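The tuner returns every hyperparameter value as a string, and categorical values even keep their JSON quotes, hence the quote-stripping above. A small hypothetical helper (not part of the SageMaker SDK) to normalize such values to native Python types:

```python
def parse_hp(value):
    """Normalize a SageMaker hyperparameter string to a native Python type."""
    value = value.strip('"')  # categorical values arrive as e.g. '"128"'
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value  # leave non-numeric strings as-is

parse_hp('"128"')                # -> 128
parse_hp('0.06246976097402943')  # -> 0.06246976097402943
```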
In [28]:
#Set up debugging and profiling rules and hooks
rules = [
    Rule.sagemaker(rule_configs.vanishing_gradient()),
    Rule.sagemaker(rule_configs.overfit()),
    Rule.sagemaker(rule_configs.overtraining()),
    Rule.sagemaker(rule_configs.poor_weight_initialization()),
    Rule.sagemaker(rule_configs.loss_not_decreasing()),
    ProfilerRule.sagemaker(rule_configs.LowGPUUtilization()),
    ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
]

hook_config = DebuggerHookConfig(
    hook_parameters = {"train.save_interval": "10", "eval.save_interval": "5"},
    collection_configs = [CollectionConfig(name = "CrossEntropyLoss_output_0",
                                           parameters = {
                                               "include_regex": "CrossEntropyLoss_output_0",
                                               "train.save_interval": "10",
                                               "eval.save_interval": "1"})]
)

profiler_config = ProfilerConfig(
    system_monitor_interval_millis=500,
    framework_profile_params=FrameworkProfile(num_steps=10)
)

Single-Instance Training¶

In [29]:
#adjust this cell to accomplish multi-instance training
estimator = PyTorch(
    entry_point='code/train.py',
    base_job_name='inventory-monitoring',
    role=role,
    instance_count=1,
    instance_type='ml.g4dn.xlarge',
    framework_version='1.8',
    py_version='py36',
    hyperparameters=hyperparameters,
    output_path = "s3://udacity-capstone-project-2023/output-best/",
    ## Debugger and Profiler parameters
    rules = rules,
    debugger_hook_config=hook_config,
    profiler_config=profiler_config,
)
In [30]:
os.environ['SM_CHANNEL_TRAINING']='s3://udacity-capstone-project-2023/'
os.environ['SM_MODEL_DIR']='s3://udacity-capstone-project-2023/models/'
os.environ['SM_OUTPUT_DATA_DIR']='s3://udacity-capstone-project-2023/output/'

estimator.fit({"training": "s3://udacity-capstone-project-2023/"})
2023-04-06 00:01:29 Starting - Starting the training job...
2023-04-06 00:01:58 Starting - Preparing the instances for trainingVanishingGradient: InProgress
Overfit: InProgress
Overtraining: InProgress
PoorWeightInitialization: InProgress
LossNotDecreasing: InProgress
LowGPUUtilization: InProgress
ProfilerReport: InProgress
......
2023-04-06 00:02:58 Downloading - Downloading input data...
2023-04-06 00:03:19 Training - Downloading the training image.....................
2023-04-06 00:07:00 Training - Training image download completed. Training in progress..bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2023-04-06 00:07:03,398 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training
2023-04-06 00:07:03,430 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2023-04-06 00:07:03,433 sagemaker_pytorch_container.training INFO     Invoking user training script.
2023-04-06 00:07:03,674 sagemaker-training-toolkit INFO     Invoking user script
Training Env:
{
    "additional_framework_parameters": {},
    "channel_input_dirs": {
        "training": "/opt/ml/input/data/training"
    },
    "current_host": "algo-1",
    "framework_module": "sagemaker_pytorch_container.training:main",
    "hosts": [
        "algo-1"
    ],
    "hyperparameters": {
        "batch_size": 128,
        "epochs": "25",
        "learning_rate": "0.06246976097402943"
    },
    "input_config_dir": "/opt/ml/input/config",
    "input_data_config": {
        "training": {
            "TrainingInputMode": "File",
            "S3DistributionType": "FullyReplicated",
            "RecordWrapperType": "None"
        }
    },
    "input_dir": "/opt/ml/input",
    "is_master": true,
    "job_name": "inventory-monitoring-2023-04-06-00-01-29-320",
    "log_level": 20,
    "master_hostname": "algo-1",
    "model_dir": "/opt/ml/model",
    "module_dir": "s3://udacity-capstone-project-2023/inventory-monitoring-2023-04-06-00-01-29-320/source/sourcedir.tar.gz",
    "module_name": "train",
    "network_interface_name": "eth0",
    "num_cpus": 4,
    "num_gpus": 1,
    "output_data_dir": "/opt/ml/output/data",
    "output_dir": "/opt/ml/output",
    "output_intermediate_dir": "/opt/ml/output/intermediate",
    "resource_config": {
        "current_host": "algo-1",
        "current_instance_type": "ml.g4dn.xlarge",
        "current_group_name": "homogeneousCluster",
        "hosts": [
            "algo-1"
        ],
        "instance_groups": [
            {
                "instance_group_name": "homogeneousCluster",
                "instance_type": "ml.g4dn.xlarge",
                "hosts": [
                    "algo-1"
                ]
            }
        ],
        "network_interface_name": "eth0"
    },
    "user_entry_point": "train.py"
}
Environment variables:
SM_HOSTS=["algo-1"]
SM_NETWORK_INTERFACE_NAME=eth0
SM_HPS={"batch_size":128,"epochs":"25","learning_rate":"0.06246976097402943"}
SM_USER_ENTRY_POINT=train.py
SM_FRAMEWORK_PARAMS={}
SM_RESOURCE_CONFIG={"current_group_name":"homogeneousCluster","current_host":"algo-1","current_instance_type":"ml.g4dn.xlarge","hosts":["algo-1"],"instance_groups":[{"hosts":["algo-1"],"instance_group_name":"homogeneousCluster","instance_type":"ml.g4dn.xlarge"}],"network_interface_name":"eth0"}
SM_INPUT_DATA_CONFIG={"training":{"RecordWrapperType":"None","S3DistributionType":"FullyReplicated","TrainingInputMode":"File"}}
SM_OUTPUT_DATA_DIR=/opt/ml/output/data
SM_CHANNELS=["training"]
SM_CURRENT_HOST=algo-1
SM_MODULE_NAME=train
SM_LOG_LEVEL=20
SM_FRAMEWORK_MODULE=sagemaker_pytorch_container.training:main
SM_INPUT_DIR=/opt/ml/input
SM_INPUT_CONFIG_DIR=/opt/ml/input/config
SM_OUTPUT_DIR=/opt/ml/output
SM_NUM_CPUS=4
SM_NUM_GPUS=1
SM_MODEL_DIR=/opt/ml/model
SM_MODULE_DIR=s3://udacity-capstone-project-2023/inventory-monitoring-2023-04-06-00-01-29-320/source/sourcedir.tar.gz
SM_TRAINING_ENV={"additional_framework_parameters":{},"channel_input_dirs":{"training":"/opt/ml/input/data/training"},"current_host":"algo-1","framework_module":"sagemaker_pytorch_container.training:main","hosts":["algo-1"],"hyperparameters":{"batch_size":128,"epochs":"25","learning_rate":"0.06246976097402943"},"input_config_dir":"/opt/ml/input/config","input_data_config":{"training":{"RecordWrapperType":"None","S3DistributionType":"FullyReplicated","TrainingInputMode":"File"}},"input_dir":"/opt/ml/input","is_master":true,"job_name":"inventory-monitoring-2023-04-06-00-01-29-320","log_level":20,"master_hostname":"algo-1","model_dir":"/opt/ml/model","module_dir":"s3://udacity-capstone-project-2023/inventory-monitoring-2023-04-06-00-01-29-320/source/sourcedir.tar.gz","module_name":"train","network_interface_name":"eth0","num_cpus":4,"num_gpus":1,"output_data_dir":"/opt/ml/output/data","output_dir":"/opt/ml/output","output_intermediate_dir":"/opt/ml/output/intermediate","resource_config":{"current_group_name":"homogeneousCluster","current_host":"algo-1","current_instance_type":"ml.g4dn.xlarge","hosts":["algo-1"],"instance_groups":[{"hosts":["algo-1"],"instance_group_name":"homogeneousCluster","instance_type":"ml.g4dn.xlarge"}],"network_interface_name":"eth0"},"user_entry_point":"train.py"}
SM_USER_ARGS=["--batch_size","128","--epochs","25","--learning_rate","0.06246976097402943"]
SM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate
SM_CHANNEL_TRAINING=/opt/ml/input/data/training
SM_HP_BATCH_SIZE=128
SM_HP_EPOCHS=25
SM_HP_LEARNING_RATE=0.06246976097402943
PYTHONPATH=/opt/ml/code:/opt/conda/bin:/opt/conda/lib/python36.zip:/opt/conda/lib/python3.6:/opt/conda/lib/python3.6/lib-dynload:/opt/conda/lib/python3.6/site-packages
Invoking script with the following command:
/opt/conda/bin/python3.6 train.py --batch_size 128 --epochs 25 --learning_rate 0.06246976097402943
[2023-04-06 00:07:04.939 algo-1:27 INFO utils.py:27] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2023-04-06 00:07:04.976 algo-1:27 INFO profiler_config_parser.py:102] Using config at /opt/ml/input/config/profilerconfig.json.
Namespace(batch_size=128, data='/opt/ml/input/data/training', epochs=25, learning_rate=0.06246976097402943, learning_rate_decay=10, model_dir='/opt/ml/model', output_dir='/opt/ml/output/data')
Hyperparameters are LR: 0.06246976097402943, Batch Size: 128
Data Paths: /opt/ml/input/data/training
creating model 'resnet34' from checkpoint
[2023-04-06 00:07:09.553 algo-1:27 INFO json_config.py:91] Creating hook from json_config at /opt/ml/input/config/debughookconfig.json.
[2023-04-06 00:07:09.555 algo-1:27 INFO hook.py:201] tensorboard_dir has not been set for the hook. SMDebug will not be exporting tensorboard summaries.
[2023-04-06 00:07:09.556 algo-1:27 INFO hook.py:255] Saving to /opt/ml/output/tensors
[2023-04-06 00:07:09.556 algo-1:27 INFO state_store.py:77] The checkpoint config file /opt/ml/input/config/checkpointconfig.json does not exist.
[2023-04-06 00:07:09.574 algo-1:27 INFO hook.py:591] name:conv1.weight count_params:9408
[2023-04-06 00:07:09.574 algo-1:27 INFO hook.py:591] name:bn1.weight count_params:64
[2023-04-06 00:07:09.575 algo-1:27 INFO hook.py:591] name:bn1.bias count_params:64
[2023-04-06 00:07:09.575 algo-1:27 INFO hook.py:591] name:layer1.0.conv1.weight count_params:36864
[2023-04-06 00:07:09.575 algo-1:27 INFO hook.py:591] name:layer1.0.bn1.weight count_params:64
[2023-04-06 00:07:09.576 algo-1:27 INFO hook.py:591] name:layer1.0.bn1.bias count_params:64
[2023-04-06 00:07:09.576 algo-1:27 INFO hook.py:591] name:layer1.0.conv2.weight count_params:36864
[2023-04-06 00:07:09.576 algo-1:27 INFO hook.py:591] name:layer1.0.bn2.weight count_params:64
[2023-04-06 00:07:09.577 algo-1:27 INFO hook.py:591] name:layer1.0.bn2.bias count_params:64
[2023-04-06 00:07:09.577 algo-1:27 INFO hook.py:591] name:layer1.1.conv1.weight count_params:36864
[2023-04-06 00:07:09.577 algo-1:27 INFO hook.py:591] name:layer1.1.bn1.weight count_params:64
[2023-04-06 00:07:09.578 algo-1:27 INFO hook.py:591] name:layer1.1.bn1.bias count_params:64
[2023-04-06 00:07:09.578 algo-1:27 INFO hook.py:591] name:layer1.1.conv2.weight count_params:36864
[2023-04-06 00:07:09.578 algo-1:27 INFO hook.py:591] name:layer1.1.bn2.weight count_params:64
[2023-04-06 00:07:09.578 algo-1:27 INFO hook.py:591] name:layer1.1.bn2.bias count_params:64
[2023-04-06 00:07:09.579 algo-1:27 INFO hook.py:591] name:layer1.2.conv1.weight count_params:36864
[2023-04-06 00:07:09.579 algo-1:27 INFO hook.py:591] name:layer1.2.bn1.weight count_params:64
[2023-04-06 00:07:09.579 algo-1:27 INFO hook.py:591] name:layer1.2.bn1.bias count_params:64
[2023-04-06 00:07:09.580 algo-1:27 INFO hook.py:591] name:layer1.2.conv2.weight count_params:36864
[2023-04-06 00:07:09.580 algo-1:27 INFO hook.py:591] name:layer1.2.bn2.weight count_params:64
[2023-04-06 00:07:09.580 algo-1:27 INFO hook.py:591] name:layer1.2.bn2.bias count_params:64
[2023-04-06 00:07:09.580 algo-1:27 INFO hook.py:591] name:layer2.0.conv1.weight count_params:73728
[2023-04-06 00:07:09.581 algo-1:27 INFO hook.py:591] name:layer2.0.bn1.weight count_params:128
[2023-04-06 00:07:09.581 algo-1:27 INFO hook.py:591] name:layer2.0.bn1.bias count_params:128
[2023-04-06 00:07:09.581 algo-1:27 INFO hook.py:591] name:layer2.0.conv2.weight count_params:147456
[2023-04-06 00:07:09.582 algo-1:27 INFO hook.py:591] name:layer2.0.bn2.weight count_params:128
[2023-04-06 00:07:09.582 algo-1:27 INFO hook.py:591] name:layer2.0.bn2.bias count_params:128
[2023-04-06 00:07:09.582 algo-1:27 INFO hook.py:591] name:layer2.0.downsample.0.weight count_params:8192
[2023-04-06 00:07:09.583 algo-1:27 INFO hook.py:591] name:layer2.0.downsample.1.weight count_params:128
[2023-04-06 00:07:09.583 algo-1:27 INFO hook.py:591] name:layer2.0.downsample.1.bias count_params:128
[2023-04-06 00:07:09.583 algo-1:27 INFO hook.py:591] name:layer2.1.conv1.weight count_params:147456
[2023-04-06 00:07:09.584 algo-1:27 INFO hook.py:591] name:layer2.1.bn1.weight count_params:128
[2023-04-06 00:07:09.584 algo-1:27 INFO hook.py:591] name:layer2.1.bn1.bias count_params:128
[2023-04-06 00:07:09.584 algo-1:27 INFO hook.py:591] name:layer2.1.conv2.weight count_params:147456
[2023-04-06 00:07:09.584 algo-1:27 INFO hook.py:591] name:layer2.1.bn2.weight count_params:128
[2023-04-06 00:07:09.585 algo-1:27 INFO hook.py:591] name:layer2.1.bn2.bias count_params:128
[2023-04-06 00:07:09.585 algo-1:27 INFO hook.py:591] name:layer2.2.conv1.weight count_params:147456
[2023-04-06 00:07:09.585 algo-1:27 INFO hook.py:591] name:layer2.2.bn1.weight count_params:128
[2023-04-06 00:07:09.586 algo-1:27 INFO hook.py:591] name:layer2.2.bn1.bias count_params:128
[2023-04-06 00:07:09.586 algo-1:27 INFO hook.py:591] name:layer2.2.conv2.weight count_params:147456
[2023-04-06 00:07:09.586 algo-1:27 INFO hook.py:591] name:layer2.2.bn2.weight count_params:128
[2023-04-06 00:07:09.586 algo-1:27 INFO hook.py:591] name:layer2.2.bn2.bias count_params:128
[2023-04-06 00:07:09.587 algo-1:27 INFO hook.py:591] name:layer2.3.conv1.weight count_params:147456
[2023-04-06 00:07:09.587 algo-1:27 INFO hook.py:591] name:layer2.3.bn1.weight count_params:128
[2023-04-06 00:07:09.587 algo-1:27 INFO hook.py:591] name:layer2.3.bn1.bias count_params:128
[2023-04-06 00:07:09.588 algo-1:27 INFO hook.py:591] name:layer2.3.conv2.weight count_params:147456
[2023-04-06 00:07:09.588 algo-1:27 INFO hook.py:591] name:layer2.3.bn2.weight count_params:128
[2023-04-06 00:07:09.588 algo-1:27 INFO hook.py:591] name:layer2.3.bn2.bias count_params:128
[2023-04-06 00:07:09.589 algo-1:27 INFO hook.py:591] name:layer3.0.conv1.weight count_params:294912
[2023-04-06 00:07:09.589 algo-1:27 INFO hook.py:591] name:layer3.0.bn1.weight count_params:256
[2023-04-06 00:07:09.589 algo-1:27 INFO hook.py:591] name:layer3.0.bn1.bias count_params:256
[2023-04-06 00:07:09.590 algo-1:27 INFO hook.py:591] name:layer3.0.conv2.weight count_params:589824
[2023-04-06 00:07:09.590 algo-1:27 INFO hook.py:591] name:layer3.0.bn2.weight count_params:256
[2023-04-06 00:07:09.590 algo-1:27 INFO hook.py:591] name:layer3.0.bn2.bias count_params:256
[2023-04-06 00:07:09.590 algo-1:27 INFO hook.py:591] name:layer3.0.downsample.0.weight count_params:32768
[2023-04-06 00:07:09.591 algo-1:27 INFO hook.py:591] name:layer3.0.downsample.1.weight count_params:256
[2023-04-06 00:07:09.591 algo-1:27 INFO hook.py:591] name:layer3.0.downsample.1.bias count_params:256
[2023-04-06 00:07:09.591 algo-1:27 INFO hook.py:591] name:layer3.1.conv1.weight count_params:589824
[2023-04-06 00:07:09.592 algo-1:27 INFO hook.py:591] name:layer3.1.bn1.weight count_params:256
[2023-04-06 00:07:09.592 algo-1:27 INFO hook.py:591] name:layer3.1.bn1.bias count_params:256
[2023-04-06 00:07:09.592 algo-1:27 INFO hook.py:591] name:layer3.1.conv2.weight count_params:589824
[2023-04-06 00:07:09.593 algo-1:27 INFO hook.py:591] name:layer3.1.bn2.weight count_params:256
[2023-04-06 00:07:09.593 algo-1:27 INFO hook.py:591] name:layer3.1.bn2.bias count_params:256
[2023-04-06 00:07:09.593 algo-1:27 INFO hook.py:591] name:layer3.2.conv1.weight count_params:589824
[2023-04-06 00:07:09.593 algo-1:27 INFO hook.py:591] name:layer3.2.bn1.weight count_params:256
[2023-04-06 00:07:09.594 algo-1:27 INFO hook.py:591] name:layer3.2.bn1.bias count_params:256
[2023-04-06 00:07:09.594 algo-1:27 INFO hook.py:591] name:layer3.2.conv2.weight count_params:589824
[2023-04-06 00:07:09.594 algo-1:27 INFO hook.py:591] name:layer3.2.bn2.weight count_params:256
[2023-04-06 00:07:09.595 algo-1:27 INFO hook.py:591] name:layer3.2.bn2.bias count_params:256
[2023-04-06 00:07:09.595 algo-1:27 INFO hook.py:591] name:layer3.3.conv1.weight count_params:589824
[2023-04-06 00:07:09.595 algo-1:27 INFO hook.py:591] name:layer3.3.bn1.weight count_params:256
[2023-04-06 00:07:09.595 algo-1:27 INFO hook.py:591] name:layer3.3.bn1.bias count_params:256
[2023-04-06 00:07:09.596 algo-1:27 INFO hook.py:591] name:layer3.3.conv2.weight count_params:589824
[2023-04-06 00:07:09.596 algo-1:27 INFO hook.py:591] name:layer3.3.bn2.weight count_params:256
[2023-04-06 00:07:09.596 algo-1:27 INFO hook.py:591] name:layer3.3.bn2.bias count_params:256
[2023-04-06 00:07:09.597 algo-1:27 INFO hook.py:591] name:layer3.4.conv1.weight count_params:589824
[2023-04-06 00:07:09.597 algo-1:27 INFO hook.py:591] name:layer3.4.bn1.weight count_params:256
[2023-04-06 00:07:09.597 algo-1:27 INFO hook.py:591] name:layer3.4.bn1.bias count_params:256
[2023-04-06 00:07:09.598 algo-1:27 INFO hook.py:591] name:layer3.4.conv2.weight count_params:589824
[2023-04-06 00:07:09.598 algo-1:27 INFO hook.py:591] name:layer3.4.bn2.weight count_params:256
[2023-04-06 00:07:09.598 algo-1:27 INFO hook.py:591] name:layer3.4.bn2.bias count_params:256
[2023-04-06 00:07:09.598 algo-1:27 INFO hook.py:591] name:layer3.5.conv1.weight count_params:589824
[2023-04-06 00:07:09.599 algo-1:27 INFO hook.py:591] name:layer3.5.bn1.weight count_params:256
[2023-04-06 00:07:09.599 algo-1:27 INFO hook.py:591] name:layer3.5.bn1.bias count_params:256
[2023-04-06 00:07:09.599 algo-1:27 INFO hook.py:591] name:layer3.5.conv2.weight count_params:589824
[2023-04-06 00:07:09.600 algo-1:27 INFO hook.py:591] name:layer3.5.bn2.weight count_params:256
[2023-04-06 00:07:09.600 algo-1:27 INFO hook.py:591] name:layer3.5.bn2.bias count_params:256
[2023-04-06 00:07:09.600 algo-1:27 INFO hook.py:591] name:layer4.0.conv1.weight count_params:1179648
[2023-04-06 00:07:09.601 algo-1:27 INFO hook.py:591] name:layer4.0.bn1.weight count_params:512
[2023-04-06 00:07:09.601 algo-1:27 INFO hook.py:591] name:layer4.0.bn1.bias count_params:512
[2023-04-06 00:07:09.601 algo-1:27 INFO hook.py:591] name:layer4.0.conv2.weight count_params:2359296
[2023-04-06 00:07:09.602 algo-1:27 INFO hook.py:591] name:layer4.0.bn2.weight count_params:512
[2023-04-06 00:07:09.602 algo-1:27 INFO hook.py:591] name:layer4.0.bn2.bias count_params:512
[2023-04-06 00:07:09.602 algo-1:27 INFO hook.py:591] name:layer4.0.downsample.0.weight count_params:131072
[2023-04-06 00:07:09.602 algo-1:27 INFO hook.py:591] name:layer4.0.downsample.1.weight count_params:512
[2023-04-06 00:07:09.603 algo-1:27 INFO hook.py:591] name:layer4.0.downsample.1.bias count_params:512
[2023-04-06 00:07:09.603 algo-1:27 INFO hook.py:591] name:layer4.1.conv1.weight count_params:2359296
[2023-04-06 00:07:09.603 algo-1:27 INFO hook.py:591] name:layer4.1.bn1.weight count_params:512
[2023-04-06 00:07:09.604 algo-1:27 INFO hook.py:591] name:layer4.1.bn1.bias count_params:512
[2023-04-06 00:07:09.604 algo-1:27 INFO hook.py:591] name:layer4.1.conv2.weight count_params:2359296
[2023-04-06 00:07:09.604 algo-1:27 INFO hook.py:591] name:layer4.1.bn2.weight count_params:512
[2023-04-06 00:07:09.605 algo-1:27 INFO hook.py:591] name:layer4.1.bn2.bias count_params:512
[2023-04-06 00:07:09.605 algo-1:27 INFO hook.py:591] name:layer4.2.conv1.weight count_params:2359296
[2023-04-06 00:07:09.605 algo-1:27 INFO hook.py:591] name:layer4.2.bn1.weight count_params:512
[2023-04-06 00:07:09.605 algo-1:27 INFO hook.py:591] name:layer4.2.bn1.bias count_params:512
[2023-04-06 00:07:09.606 algo-1:27 INFO hook.py:591] name:layer4.2.conv2.weight count_params:2359296
[2023-04-06 00:07:09.606 algo-1:27 INFO hook.py:591] name:layer4.2.bn2.weight count_params:512
[2023-04-06 00:07:09.606 algo-1:27 INFO hook.py:591] name:layer4.2.bn2.bias count_params:512
[2023-04-06 00:07:09.607 algo-1:27 INFO hook.py:591] name:fc.weight count_params:3072
[2023-04-06 00:07:09.607 algo-1:27 INFO hook.py:591] name:fc.bias count_params:6
[2023-04-06 00:07:09.607 algo-1:27 INFO hook.py:593] Total Trainable Params: 21287750
Starting Model Training
Epoch: 0
[2023-04-06 00:07:10.642 algo-1:27 INFO hook.py:425] Monitoring the collections: CrossEntropyLoss_output_0, gradients, relu_input, losses
[2023-04-06 00:07:10.643 algo-1:27 INFO python_profiler.py:182] Dumping cProfile stats to /opt/ml/output/profiler/framework/pytorch/cprofile/27-algo-1/prestepzero-*-start-1680739624976959.5_train-0-stepstart-1680739630643270.5/python_stats.
[2023-04-06 00:07:10.708 algo-1:27 INFO hook.py:488] Hook is writing from the hook with pid: 27
Epoch: [0][0/12] lr 0.06247#011Time 19.500 (19.500)#011Data 1.012 (1.012)#011Loss 3.2178 (3.2178)#011Prec 0.250 (0.250)
Epoch: [0][1/12] lr 0.06247#011Time 8.748 (14.124)#011Data 1.155 (1.083)#011Loss 2.9840 (3.1009)#011Prec 0.273 (0.262)
Epoch: [0][2/12] lr 0.06247#011Time 7.803 (12.017)#011Data 1.343 (1.170)#011Loss 2.3679 (2.8566)#011Prec 0.305 (0.276)
Epoch: [0][3/12] lr 0.06247#011Time 6.696 (10.687)#011Data 1.049 (1.140)#011Loss 2.0976 (2.6668)#011Prec 0.383 (0.303)
Epoch: [0][4/12] lr 0.06247#011Time 6.838 (9.917)#011Data 0.986 (1.109)#011Loss 1.9921 (2.5319)#011Prec 0.234 (0.289)
Epoch: [0][5/12] lr 0.06247#011Time 6.834 (9.403)#011Data 1.068 (1.102)#011Loss 1.8697 (2.4215)#011Prec 0.227 (0.279)
Epoch: [0][6/12] lr 0.06247#011Time 6.733 (9.022)#011Data 1.017 (1.090)#011Loss 1.6290 (2.3083)#011Prec 0.297 (0.281)
Epoch: [0][7/12] lr 0.06247#011Time 6.789 (8.743)#011Data 1.124 (1.094)#011Loss 1.7336 (2.2365)#011Prec 0.266 (0.279)
Epoch: [0][8/12] lr 0.06247#011Time 6.971 (8.546)#011Data 1.158 (1.101)#011Loss 1.6927 (2.1761)#011Prec 0.211 (0.272)
Epoch: [0][9/12] lr 0.06247#011Time 6.845 (8.376)#011Data 1.086 (1.100)#011Loss 1.6193 (2.1204)#011Prec 0.266 (0.271)
Epoch: [0][10/12] lr 0.06247#011Time 4.079 (7.985)#011Data 1.073 (1.097)#011Loss 1.4999 (2.0640)#011Prec 0.336 (0.277)
Epoch: [0][11/12] lr 0.06247#011Time 1.255 (7.424)#011Data 0.384 (1.038)#011Loss 1.6476 (2.0489)#011Prec 0.189 (0.274)
validation: [0/18]#011Time 7.702 (7.702)#011Loss 4.8511 (4.8511)#011Prec 0.117 (0.117)
validation: [1/18]#011Time 1.176 (4.439)#011Loss 4.5370 (4.6941)#011Prec 0.164 (0.141)
validation: [2/18]#011Time 1.223 (3.367)#011Loss 5.2358 (4.8747)#011Prec 0.102 (0.128)
validation: [3/18]#011Time 1.274 (2.844)#011Loss 4.9848 (4.9022)#011Prec 0.141 (0.131)
validation: [4/18]#011Time 1.177 (2.510)#011Loss 4.9384 (4.9094)#011Prec 0.102 (0.125)
validation: [5/18]#011Time 1.314 (2.311)#011Loss 4.9891 (4.9227)#011Prec 0.148 (0.129)
validation: [6/18]#011Time 1.196 (2.152)#011Loss 4.9236 (4.9228)#011Prec 0.117 (0.127)
validation: [7/18]#011Time 1.190 (2.031)#011Loss 5.0600 (4.9400)#011Prec 0.125 (0.127)
validation: [8/18]#011Time 1.276 (1.947)#011Loss 5.1309 (4.9612)#011Prec 0.078 (0.122)
validation: [9/18]#011Time 1.312 (1.884)#011Loss 5.2566 (4.9907)#011Prec 0.109 (0.120)
validation: [10/18]#011Time 1.258 (1.827)#011Loss 4.9745 (4.9893)#011Prec 0.156 (0.124)
validation: [11/18]#011Time 1.240 (1.778)#011Loss 4.6565 (4.9615)#011Prec 0.141 (0.125)
validation: [12/18]#011Time 1.284 (1.740)#011Loss 5.3462 (4.9911)#011Prec 0.102 (0.123)
validation: [13/18]#011Time 1.169 (1.699)#011Loss 4.2942 (4.9413)#011Prec 0.180 (0.127)
validation: [14/18]#011Time 1.259 (1.670)#011Loss 5.0822 (4.9507)#011Prec 0.094 (0.125)
validation: [15/18]#011Time 1.003 (1.628)#011Loss 4.7374 (4.9374)#011Prec 0.117 (0.125)
validation: [16/18]#011Time 1.097 (1.597)#011Loss 4.8701 (4.9334)#011Prec 0.148 (0.126)
validation: [17/18]#011Time 0.296 (1.525)#011Loss 4.0670 (4.9255)#011Prec 0.050 (0.125)
*Validation Precision: 0.125
Epoch: 1
Epoch: [1][0/12] lr 0.06247#011Time 1.352 (1.352)#011Data 0.813 (0.813)#011Loss 1.5312 (1.5312)#011Prec 0.258 (0.258)
Epoch: [1][1/12] lr 0.06247#011Time 1.307 (1.329)#011Data 0.767 (0.790)#011Loss 1.5818 (1.5565)#011Prec 0.258 (0.258)
Epoch: [1][2/12] lr 0.06247#011Time 1.349 (1.336)#011Data 0.811 (0.797)#011Loss 1.4456 (1.5195)#011Prec 0.320 (0.279)
Epoch: [1][3/12] lr 0.06247#011Time 1.311 (1.330)#011Data 0.773 (0.791)#011Loss 1.5921 (1.5377)#011Prec 0.250 (0.271)
Epoch: [1][4/12] lr 0.06247#011Time 1.276 (1.319)#011Data 0.737 (0.780)#011Loss 1.5576 (1.5417)#011Prec 0.320 (0.281)
Epoch: [1][5/12] lr 0.06247#011Time 1.388 (1.330)#011Data 0.849 (0.792)#011Loss 1.4521 (1.5267)#011Prec 0.328 (0.289)
Epoch: [1][6/12] lr 0.06247#011Time 1.316 (1.328)#011Data 0.776 (0.789)#011Loss 1.5770 (1.5339)#011Prec 0.281 (0.288)
Epoch: [1][7/12] lr 0.06247#011Time 1.266 (1.321)#011Data 0.727 (0.782)#011Loss 1.5534 (1.5364)#011Prec 0.320 (0.292)
Epoch: [1][8/12] lr 0.06247#011Time 1.399 (1.329)#011Data 0.856 (0.790)#011Loss 1.4558 (1.5274)#011Prec 0.281 (0.291)
Epoch: [1][9/12] lr 0.06247#011Time 1.279 (1.324)#011Data 0.738 (0.785)#011Loss 1.5353 (1.5282)#011Prec 0.305 (0.292)
Epoch: [1][10/12] lr 0.06247#011Time 1.266 (1.319)#011Data 0.727 (0.779)#011Loss 1.4830 (1.5241)#011Prec 0.320 (0.295)
Epoch: [1][11/12] lr 0.06247#011Time 0.584 (1.258)#011Data 0.341 (0.743)#011Loss 1.4188 (1.5203)#011Prec 0.415 (0.299)
validation: [0/18]#011Time 1.052 (1.052)#011Loss 1.4819 (1.4819)#011Prec 0.297 (0.297)
validation: [1/18]#011Time 1.019 (1.036)#011Loss 1.4614 (1.4716)#011Prec 0.359 (0.328)
validation: [2/18]#011Time 0.981 (1.018)#011Loss 1.5771 (1.5068)#011Prec 0.234 (0.297)
validation: [3/18]#011Time 1.031 (1.021)#011Loss 1.4631 (1.4959)#011Prec 0.328 (0.305)
validation: [4/18]#011Time 1.108 (1.039)#011Loss 1.5529 (1.5073)#011Prec 0.305 (0.305)
validation: [5/18]#011Time 0.992 (1.031)#011Loss 1.5955 (1.5220)#011Prec 0.273 (0.299)
validation: [6/18]#011Time 1.016 (1.029)#011Loss 1.5339 (1.5237)#011Prec 0.258 (0.294)
validation: [7/18]#011Time 0.986 (1.023)#011Loss 1.5520 (1.5272)#011Prec 0.305 (0.295)
validation: [8/18]#011Time 0.999 (1.021)#011Loss 1.4813 (1.5221)#011Prec 0.352 (0.301)
validation: [9/18]#011Time 0.954 (1.014)#011Loss 1.6677 (1.5367)#011Prec 0.234 (0.295)
validation: [10/18]#011Time 0.982 (1.011)#011Loss 1.4945 (1.5329)#011Prec 0.320 (0.297)
validation: [11/18]#011Time 1.016 (1.011)#011Loss 1.4904 (1.5293)#011Prec 0.320 (0.299)
validation: [12/18]#011Time 1.011 (1.011)#011Loss 1.5466 (1.5306)#011Prec 0.312 (0.300)
validation: [13/18]#011Time 1.005 (1.011)#011Loss 1.4602 (1.5256)#011Prec 0.289 (0.299)
validation: [14/18]#011Time 1.036 (1.013)#011Loss 1.5589 (1.5278)#011Prec 0.273 (0.297)
validation: [15/18]#011Time 1.037 (1.014)#011Loss 1.4176 (1.5209)#011Prec 0.328 (0.299)
validation: [16/18]#011Time 1.091 (1.019)#011Loss 1.4754 (1.5183)#011Prec 0.289 (0.299)
validation: [17/18]#011Time 0.195 (0.973)#011Loss 1.2545 (1.5159)#011Prec 0.400 (0.300)
*Validation Precision: 0.300
Epoch: 2
Epoch: [2][0/12] lr 0.06247#011Time 1.376 (1.376)#011Data 0.831 (0.831)#011Loss 1.4699 (1.4699)#011Prec 0.227 (0.227)
Epoch: [2][1/12] lr 0.06247#011Time 1.334 (1.355)#011Data 0.793 (0.812)#011Loss 1.4404 (1.4552)#011Prec 0.414 (0.320)
Epoch: [2][2/12] lr 0.06247#011Time 1.283 (1.331)#011Data 0.740 (0.788)#011Loss 1.4235 (1.4446)#011Prec 0.344 (0.328)
Epoch: [2][3/12] lr 0.06247#011Time 1.366 (1.340)#011Data 0.824 (0.797)#011Loss 1.3653 (1.4248)#011Prec 0.375 (0.340)
Epoch: [2][4/12] lr 0.06247#011Time 1.264 (1.325)#011Data 0.721 (0.782)#011Loss 1.4450 (1.4288)#011Prec 0.344 (0.341)
Epoch: [2][5/12] lr 0.06247#011Time 1.324 (1.325)#011Data 0.780 (0.781)#011Loss 1.4627 (1.4345)#011Prec 0.289 (0.332)
Epoch: [2][6/12] lr 0.06247#011Time 1.366 (1.330)#011Data 0.819 (0.787)#011Loss 1.4133 (1.4315)#011Prec 0.305 (0.328)
Epoch: [2][7/12] lr 0.06247#011Time 1.312 (1.328)#011Data 0.768 (0.785)#011Loss 1.4130 (1.4291)#011Prec 0.320 (0.327)
Epoch: [2][8/12] lr 0.06247#011Time 1.307 (1.326)#011Data 0.763 (0.782)#011Loss 1.4214 (1.4283)#011Prec 0.289 (0.323)
Epoch: [2][9/12] lr 0.06247#011Time 1.314 (1.325)#011Data 0.770 (0.781)#011Loss 1.4138 (1.4268)#011Prec 0.375 (0.328)
Epoch: [2][10/12] lr 0.06247#011Time 1.342 (1.326)#011Data 0.800 (0.783)#011Loss 1.4129 (1.4256)#011Prec 0.344 (0.330)
Epoch: [2][11/12] lr 0.06247#011Time 0.574 (1.263)#011Data 0.331 (0.745)#011Loss 1.4487 (1.4264)#011Prec 0.340 (0.330)
validation: [0/18]#011Time 1.005 (1.005)#011Loss 1.8603 (1.8603)#011Prec 0.242 (0.242)
validation: [1/18]#011Time 0.992 (0.999)#011Loss 1.6954 (1.7778)#011Prec 0.281 (0.262)
validation: [2/18]#011Time 1.047 (1.015)#011Loss 1.8486 (1.8014)#011Prec 0.266 (0.263)
validation: [3/18]#011Time 1.046 (1.023)#011Loss 1.9671 (1.8428)#011Prec 0.203 (0.248)
validation: [4/18]#011Time 0.992 (1.016)#011Loss 1.7837 (1.8310)#011Prec 0.258 (0.250)
validation: [5/18]#011Time 0.939 (1.004)#011Loss 1.8411 (1.8327)#011Prec 0.164 (0.236)
validation: [6/18]#011Time 1.040 (1.009)#011Loss 1.8745 (1.8387)#011Prec 0.273 (0.241)
validation: [7/18]#011Time 1.136 (1.025)#011Loss 1.7256 (1.8245)#011Prec 0.227 (0.239)
validation: [8/18]#011Time 1.187 (1.043)#011Loss 1.7620 (1.8176)#011Prec 0.297 (0.246)
validation: [9/18]#011Time 1.016 (1.040)#011Loss 1.9055 (1.8264)#011Prec 0.180 (0.239)
validation: [10/18]#011Time 0.987 (1.035)#011Loss 1.7383 (1.8184)#011Prec 0.250 (0.240)
validation: [11/18]#011Time 1.015 (1.034)#011Loss 1.9926 (1.8329)#011Prec 0.164 (0.234)
validation: [12/18]#011Time 0.977 (1.029)#011Loss 1.7837 (1.8291)#011Prec 0.258 (0.236)
validation: [13/18]#011Time 0.987 (1.026)#011Loss 1.8528 (1.8308)#011Prec 0.227 (0.235)
validation: [14/18]#011Time 0.972 (1.023)#011Loss 1.8415 (1.8315)#011Prec 0.164 (0.230)
validation: [15/18]#011Time 0.988 (1.020)#011Loss 1.8749 (1.8342)#011Prec 0.258 (0.232)
validation: [16/18]#011Time 0.988 (1.018)#011Loss 1.6940 (1.8260)#011Prec 0.234 (0.232)
validation: [17/18]#011Time 0.200 (0.973)#011Loss 2.3051 (1.8303)#011Prec 0.050 (0.230)
*Validation Precision: 0.230
Epoch: 3
Epoch: [3][0/12] lr 0.06247#011Time 1.310 (1.310)#011Data 0.763 (0.763)#011Loss 1.3060 (1.3060)#011Prec 0.445 (0.445)
Epoch: [3][1/12] lr 0.06247#011Time 1.345 (1.327)#011Data 0.799 (0.781)#011Loss 1.3282 (1.3171)#011Prec 0.461 (0.453)
Epoch: [3][2/12] lr 0.06247#011Time 1.282 (1.312)#011Data 0.737 (0.766)#011Loss 1.4114 (1.3485)#011Prec 0.359 (0.422)
Epoch: [3][3/12] lr 0.06247#011Time 1.325 (1.315)#011Data 0.779 (0.770)#011Loss 1.3925 (1.3595)#011Prec 0.422 (0.422)
Epoch: [3][4/12] lr 0.06247#011Time 1.301 (1.313)#011Data 0.752 (0.766)#011Loss 1.3760 (1.3628)#011Prec 0.328 (0.403)
Epoch: [3][5/12] lr 0.06247#011Time 1.334 (1.316)#011Data 0.787 (0.770)#011Loss 1.4455 (1.3766)#011Prec 0.320 (0.389)
Epoch: [3][6/12] lr 0.06247#011Time 1.243 (1.306)#011Data 0.698 (0.759)#011Loss 1.3645 (1.3749)#011Prec 0.438 (0.396)
Epoch: [3][7/12] lr 0.06247#011Time 1.374 (1.314)#011Data 0.827 (0.768)#011Loss 1.4108 (1.3794)#011Prec 0.359 (0.392)
Epoch: [3][8/12] lr 0.06247#011Time 1.351 (1.318)#011Data 0.806 (0.772)#011Loss 1.3222 (1.3730)#011Prec 0.352 (0.387)
Epoch: [3][9/12] lr 0.06247#011Time 1.291 (1.316)#011Data 0.744 (0.769)#011Loss 1.4969 (1.3854)#011Prec 0.320 (0.380)
Epoch: [3][10/12] lr 0.06247#011Time 1.334 (1.317)#011Data 0.786 (0.771)#011Loss 1.4520 (1.3915)#011Prec 0.297 (0.373)
VanishingGradient: InProgress
Overfit: InProgress
Overtraining: IssuesFound
PoorWeightInitialization: InProgress
LossNotDecreasing: InProgress
Epoch: [3][11/12] lr 0.06247#011Time 0.550 (1.253)#011Data 0.306 (0.732)#011Loss 1.2630 (1.3868)#011Prec 0.396 (0.374)
validation: [0/18]#011Time 0.985 (0.985)#011Loss 1.2623 (1.2623)#011Prec 0.438 (0.438)
validation: [1/18]#011Time 1.037 (1.011)#011Loss 1.3266 (1.2944)#011Prec 0.336 (0.387)
validation: [2/18]#011Time 0.994 (1.005)#011Loss 1.2607 (1.2832)#011Prec 0.414 (0.396)
validation: [3/18]#011Time 1.023 (1.010)#011Loss 1.3614 (1.3027)#011Prec 0.398 (0.396)
validation: [4/18]#011Time 1.064 (1.020)#011Loss 1.3333 (1.3088)#011Prec 0.352 (0.388)
validation: [5/18]#011Time 1.076 (1.030)#011Loss 1.2737 (1.3030)#011Prec 0.383 (0.387)
validation: [6/18]#011Time 1.007 (1.026)#011Loss 1.3865 (1.3149)#011Prec 0.328 (0.378)
validation: [7/18]#011Time 1.014 (1.025)#011Loss 1.3058 (1.3138)#011Prec 0.359 (0.376)
validation: [8/18]#011Time 1.040 (1.027)#011Loss 1.2784 (1.3099)#011Prec 0.383 (0.377)
validation: [9/18]#011Time 1.074 (1.031)#011Loss 1.2523 (1.3041)#011Prec 0.398 (0.379)
validation: [10/18]#011Time 1.012 (1.030)#011Loss 1.3013 (1.3039)#011Prec 0.383 (0.379)
validation: [11/18]#011Time 1.113 (1.037)#011Loss 1.3442 (1.3072)#011Prec 0.320 (0.374)
validation: [12/18]#011Time 0.988 (1.033)#011Loss 1.3563 (1.3110)#011Prec 0.375 (0.374)
validation: [13/18]#011Time 1.001 (1.031)#011Loss 1.3268 (1.3121)#011Prec 0.367 (0.374)
validation: [14/18]#011Time 0.982 (1.027)#011Loss 1.3299 (1.3133)#011Prec 0.391 (0.375)
validation: [15/18]#011Time 1.060 (1.029)#011Loss 1.3276 (1.3142)#011Prec 0.336 (0.373)
validation: [16/18]#011Time 0.977 (1.026)#011Loss 1.2961 (1.3131)#011Prec 0.398 (0.374)
validation: [17/18]#011Time 0.157 (0.978)#011Loss 1.1565 (1.3117)#011Prec 0.450 (0.375)
*Validation Precision: 0.375
Epoch: 4
Epoch: [4][0/12] lr 0.06247#011Time 1.321 (1.321)#011Data 0.775 (0.775)#011Loss 1.2652 (1.2652)#011Prec 0.336 (0.336)
Epoch: [4][1/12] lr 0.06247#011Time 1.377 (1.349)#011Data 0.828 (0.802)#011Loss 1.3533 (1.3093)#011Prec 0.336 (0.336)
Epoch: [4][2/12] lr 0.06247#011Time 1.422 (1.373)#011Data 0.873 (0.825)#011Loss 1.3203 (1.3130)#011Prec 0.344 (0.339)
Epoch: [4][3/12] lr 0.06247#011Time 1.350 (1.368)#011Data 0.803 (0.820)#011Loss 1.4048 (1.3359)#011Prec 0.344 (0.340)
Epoch: [4][4/12] lr 0.06247#011Time 1.327 (1.360)#011Data 0.778 (0.811)#011Loss 1.3369 (1.3361)#011Prec 0.430 (0.358)
Epoch: [4][5/12] lr 0.06247#011Time 1.269 (1.345)#011Data 0.718 (0.796)#011Loss 1.3348 (1.3359)#011Prec 0.328 (0.353)
Epoch: [4][6/12] lr 0.06247#011Time 1.317 (1.341)#011Data 0.768 (0.792)#011Loss 1.4581 (1.3533)#011Prec 0.375 (0.356)
Epoch: [4][7/12] lr 0.06247#011Time 1.337 (1.340)#011Data 0.787 (0.791)#011Loss 1.3829 (1.3570)#011Prec 0.367 (0.357)
Epoch: [4][8/12] lr 0.06247#011Time 1.335 (1.340)#011Data 0.785 (0.791)#011Loss 1.3406 (1.3552)#011Prec 0.352 (0.357)
Epoch: [4][9/12] lr 0.06247#011Time 1.335 (1.339)#011Data 0.785 (0.790)#011Loss 1.2779 (1.3475)#011Prec 0.375 (0.359)
Epoch: [4][10/12] lr 0.06247#011Time 1.325 (1.338)#011Data 0.777 (0.789)#011Loss 1.3854 (1.3509)#011Prec 0.352 (0.358)
Epoch: [4][11/12] lr 0.06247#011Time 0.550 (1.272)#011Data 0.304 (0.748)#011Loss 1.4509 (1.3546)#011Prec 0.283 (0.355)
validation: [0/18]#011Time 0.971 (0.971)#011Loss 1.2855 (1.2855)#011Prec 0.328 (0.328)
validation: [1/18]#011Time 1.082 (1.026)#011Loss 1.2639 (1.2747)#011Prec 0.445 (0.387)
validation: [2/18]#011Time 1.183 (1.079)#011Loss 1.3205 (1.2900)#011Prec 0.367 (0.380)
validation: [3/18]#011Time 0.966 (1.050)#011Loss 1.2947 (1.2912)#011Prec 0.383 (0.381)
validation: [4/18]#011Time 0.998 (1.040)#011Loss 1.2832 (1.2896)#011Prec 0.367 (0.378)
validation: [5/18]#011Time 1.006 (1.034)#011Loss 1.3434 (1.2986)#011Prec 0.383 (0.379)
validation: [6/18]#011Time 1.022 (1.033)#011Loss 1.3146 (1.3009)#011Prec 0.406 (0.383)
validation: [7/18]#011Time 1.080 (1.039)#011Loss 1.3168 (1.3028)#011Prec 0.344 (0.378)
validation: [8/18]#011Time 0.971 (1.031)#011Loss 1.3163 (1.3043)#011Prec 0.375 (0.378)
validation: [9/18]#011Time 1.028 (1.031)#011Loss 1.2877 (1.3027)#011Prec 0.359 (0.376)
validation: [10/18]#011Time 0.992 (1.027)#011Loss 1.2853 (1.3011)#011Prec 0.359 (0.374)
validation: [11/18]#011Time 1.101 (1.033)#011Loss 1.2752 (1.2989)#011Prec 0.352 (0.372)
validation: [12/18]#011Time 0.983 (1.030)#011Loss 1.2570 (1.2957)#011Prec 0.344 (0.370)
validation: [13/18]#011Time 0.980 (1.026)#011Loss 1.2343 (1.2913)#011Prec 0.453 (0.376)
validation: [14/18]#011Time 1.026 (1.026)#011Loss 1.2945 (1.2915)#011Prec 0.352 (0.374)
validation: [15/18]#011Time 1.181 (1.036)#011Loss 1.3294 (1.2939)#011Prec 0.367 (0.374)
validation: [16/18]#011Time 1.098 (1.039)#011Loss 1.2775 (1.2929)#011Prec 0.328 (0.371)
validation: [17/18]#011Time 0.185 (0.992)#011Loss 1.2798 (1.2928)#011Prec 0.400 (0.372)
*Validation Precision: 0.372
Epoch: 5
Epoch: [5][0/12] lr 0.06247#011Time 1.349 (1.349)#011Data 0.798 (0.798)#011Loss 1.3569 (1.3569)#011Prec 0.344 (0.344)
Epoch: [5][1/12] lr 0.06247#011Time 1.375 (1.362)#011Data 0.827 (0.812)#011Loss 1.2822 (1.3196)#011Prec 0.422 (0.383)
Epoch: [5][2/12] lr 0.06247#011Time 1.349 (1.358)#011Data 0.798 (0.808)#011Loss 1.3867 (1.3419)#011Prec 0.328 (0.365)
Epoch: [5][3/12] lr 0.06247#011Time 1.357 (1.357)#011Data 0.808 (0.808)#011Loss 1.4494 (1.3688)#011Prec 0.297 (0.348)
Epoch: [5][4/12] lr 0.06247#011Time 1.254 (1.337)#011Data 0.705 (0.787)#011Loss 1.4782 (1.3907)#011Prec 0.328 (0.344)
Epoch: [5][5/12] lr 0.06247#011Time 1.288 (1.329)#011Data 0.739 (0.779)#011Loss 1.3279 (1.3802)#011Prec 0.367 (0.348)
Epoch: [5][10/12] lr 0.06247#011Time 1.345 (1.324)#011Data 0.792 (0.774)#011Loss 1.3009 (1.3483)#011Prec 0.414 (0.365)
Epoch: [5][11/12] lr 0.06247#011Time 0.571 (1.261)#011Data 0.323 (0.737)#011Loss 1.1967 (1.3428)#011Prec 0.434 (0.368)
validation: [0/18]#011Time 1.024 (1.024)#011Loss 1.3796 (1.3796)#011Prec 0.289 (0.289)
validation: [1/18]#011Time 1.035 (1.029)#011Loss 1.1989 (1.2892)#011Prec 0.422 (0.355)
validation: [2/18]#011Time 0.962 (1.007)#011Loss 1.4319 (1.3368)#011Prec 0.336 (0.349)
validation: [3/18]#011Time 1.009 (1.007)#011Loss 1.3997 (1.3525)#011Prec 0.328 (0.344)
validation: [4/18]#011Time 0.945 (0.995)#011Loss 1.4531 (1.3726)#011Prec 0.344 (0.344)
validation: [5/18]#011Time 1.128 (1.017)#011Loss 1.4217 (1.3808)#011Prec 0.336 (0.342)
validation: [6/18]#011Time 1.012 (1.016)#011Loss 1.4660 (1.3930)#011Prec 0.281 (0.334)
validation: [7/18]#011Time 1.053 (1.021)#011Loss 1.4197 (1.3963)#011Prec 0.320 (0.332)
validation: [8/18]#011Time 1.036 (1.023)#011Loss 1.3422 (1.3903)#011Prec 0.359 (0.335)
validation: [9/18]#011Time 1.038 (1.024)#011Loss 1.2866 (1.3799)#011Prec 0.391 (0.341)
validation: [10/18]#011Time 1.029 (1.025)#011Loss 1.3573 (1.3779)#011Prec 0.438 (0.349)
validation: [11/18]#011Time 0.992 (1.022)#011Loss 1.2875 (1.3704)#011Prec 0.375 (0.352)
validation: [12/18]#011Time 1.032 (1.023)#011Loss 1.2573 (1.3617)#011Prec 0.484 (0.362)
validation: [13/18]#011Time 0.994 (1.021)#011Loss 1.4011 (1.3645)#011Prec 0.414 (0.366)
validation: [14/18]#011Time 1.147 (1.029)#011Loss 1.4072 (1.3673)#011Prec 0.359 (0.365)
validation: [15/18]#011Time 0.932 (1.023)#011Loss 1.3782 (1.3680)#011Prec 0.383 (0.366)
validation: [16/18]#011Time 0.946 (1.018)#011Loss 1.4090 (1.3704)#011Prec 0.438 (0.370)
validation: [17/18]#011Time 0.189 (0.972)#011Loss 1.0535 (1.3675)#011Prec 0.700 (0.373)
*Validation Precision: 0.373
Epoch: 6
Epoch: [6][0/12] lr 0.06247#011Time 1.292 (1.292)#011Data 0.742 (0.742)#011Loss 1.3600 (1.3600)#011Prec 0.352 (0.352)
Epoch: [6][1/12] lr 0.06247#011Time 1.287 (1.290)#011Data 0.740 (0.741)#011Loss 1.3000 (1.3300)#011Prec 0.414 (0.383)
Epoch: [6][2/12] lr 0.06247#011Time 1.355 (1.311)#011Data 0.805 (0.762)#011Loss 1.3640 (1.3414)#011Prec 0.422 (0.396)
Epoch: [6][3/12] lr 0.06247#011Time 1.374 (1.327)#011Data 0.825 (0.778)#011Loss 1.3408 (1.3412)#011Prec 0.375 (0.391)
Epoch: [6][4/12] lr 0.06247#011Time 1.261 (1.314)#011Data 0.713 (0.765)#011Loss 1.3728 (1.3475)#011Prec 0.352 (0.383)
Epoch: [6][5/12] lr 0.06247#011Time 1.356 (1.321)#011Data 0.807 (0.772)#011Loss 1.3309 (1.3448)#011Prec 0.359 (0.379)
Epoch: [6][6/12] lr 0.06247#011Time 1.377 (1.329)#011Data 0.829 (0.780)#011Loss 1.2878 (1.3366)#011Prec 0.367 (0.377)
Epoch: [6][7/12] lr 0.06247#011Time 1.442 (1.343)#011Data 0.893 (0.794)#011Loss 1.2851 (1.3302)#011Prec 0.438 (0.385)
Epoch: [6][8/12] lr 0.06247#011Time 1.558 (1.367)#011Data 1.006 (0.818)#011Loss 1.3416 (1.3315)#011Prec 0.320 (0.378)
Epoch: [6][9/12] lr 0.06247#011Time 1.337 (1.364)#011Data 0.789 (0.815)#011Loss 1.2454 (1.3229)#011Prec 0.445 (0.384)
Epoch: [6][10/12] lr 0.06247#011Time 1.373 (1.365)#011Data 0.823 (0.816)#011Loss 1.3304 (1.3235)#011Prec 0.406 (0.386)
Epoch: [6][11/12] lr 0.06247#011Time 0.563 (1.298)#011Data 0.318 (0.774)#011Loss 1.1967 (1.3189)#011Prec 0.453 (0.389)
validation: [0/18]#011Time 0.987 (0.987)#011Loss 1.2960 (1.2960)#011Prec 0.328 (0.328)
validation: [1/18]#011Time 1.018 (1.003)#011Loss 1.2859 (1.2910)#011Prec 0.438 (0.383)
validation: [2/18]#011Time 1.015 (1.007)#011Loss 1.3281 (1.3033)#011Prec 0.398 (0.388)
validation: [3/18]#011Time 1.001 (1.006)#011Loss 1.2924 (1.3006)#011Prec 0.312 (0.369)
validation: [4/18]#011Time 1.033 (1.011)#011Loss 1.3141 (1.3033)#011Prec 0.406 (0.377)
validation: [5/18]#011Time 1.044 (1.017)#011Loss 1.3253 (1.3070)#011Prec 0.398 (0.380)
validation: [6/18]#011Time 0.967 (1.010)#011Loss 1.3484 (1.3129)#011Prec 0.320 (0.372)
validation: [7/18]#011Time 1.039 (1.013)#011Loss 1.2723 (1.3078)#011Prec 0.383 (0.373)
validation: [8/18]#011Time 1.025 (1.015)#011Loss 1.3257 (1.3098)#011Prec 0.328 (0.368)
validation: [9/18]#011Time 0.986 (1.012)#011Loss 1.3214 (1.3110)#011Prec 0.336 (0.365)
validation: [10/18]#011Time 0.977 (1.009)#011Loss 1.2730 (1.3075)#011Prec 0.477 (0.375)
validation: [11/18]#011Time 1.073 (1.014)#011Loss 1.2517 (1.3029)#011Prec 0.336 (0.372)
validation: [12/18]#011Time 0.962 (1.010)#011Loss 1.3136 (1.3037)#011Prec 0.375 (0.372)
validation: [13/18]#011Time 1.026 (1.011)#011Loss 1.3518 (1.3071)#011Prec 0.414 (0.375)
validation: [14/18]#011Time 1.066 (1.015)#011Loss 1.3722 (1.3115)#011Prec 0.367 (0.374)
validation: [15/18]#011Time 1.081 (1.019)#011Loss 1.3905 (1.3164)#011Prec 0.328 (0.372)
validation: [16/18]#011Time 1.007 (1.018)#011Loss 1.3128 (1.3162)#011Prec 0.375 (0.372)
validation: [17/18]#011Time 0.166 (0.971)#011Loss 1.1192 (1.3144)#011Prec 0.350 (0.372)
*Validation Precision: 0.372
Epoch: 7
Epoch: [7][0/12] lr 0.06247#011Time 1.274 (1.274)#011Data 0.723 (0.723)#011Loss 1.3556 (1.3556)#011Prec 0.383 (0.383)
Epoch: [7][1/12] lr 0.06247#011Time 1.315 (1.295)#011Data 0.764 (0.744)#011Loss 1.2733 (1.3145)#011Prec 0.430 (0.406)
Epoch: [7][2/12] lr 0.06247#011Time 1.304 (1.298)#011Data 0.753 (0.747)#011Loss 1.3388 (1.3226)#011Prec 0.375 (0.396)
Epoch: [7][3/12] lr 0.06247#011Time 1.273 (1.292)#011Data 0.724 (0.741)#011Loss 1.3764 (1.3360)#011Prec 0.352 (0.385)
Epoch: [7][4/12] lr 0.06247#011Time 1.329 (1.299)#011Data 0.778 (0.748)#011Loss 1.3276 (1.3343)#011Prec 0.414 (0.391)
VanishingGradient: InProgress
Overfit: InProgress
Overtraining: IssuesFound
PoorWeightInitialization: IssuesFound
LossNotDecreasing: InProgress
LowGPUUtilization: IssuesFound
ProfilerReport: InProgress
Epoch: [7][5/12] lr 0.06247#011Time 1.340 (1.306)#011Data 0.791 (0.755)#011Loss 1.3883 (1.3433)#011Prec 0.383 (0.389)
Epoch: [7][6/12] lr 0.06247#011Time 1.394 (1.319)#011Data 0.836 (0.767)#011Loss 1.2816 (1.3345)#011Prec 0.422 (0.394)
Epoch: [7][7/12] lr 0.06247#011Time 1.278 (1.314)#011Data 0.726 (0.762)#011Loss 1.2780 (1.3275)#011Prec 0.406 (0.396)
Epoch: [7][8/12] lr 0.06247#011Time 1.357 (1.318)#011Data 0.807 (0.767)#011Loss 1.3101 (1.3255)#011Prec 0.422 (0.398)
Epoch: [7][9/12] lr 0.06247#011Time 1.361 (1.323)#011Data 0.808 (0.771)#011Loss 1.3235 (1.3253)#011Prec 0.422 (0.401)
Epoch: [7][10/12] lr 0.06247#011Time 1.366 (1.327)#011Data 0.812 (0.775)#011Loss 1.2908 (1.3222)#011Prec 0.352 (0.396)
Epoch: [7][11/12] lr 0.06247#011Time 0.581 (1.264)#011Data 0.335 (0.738)#011Loss 1.4433 (1.3266)#011Prec 0.283 (0.392)
validation: [0/18]#011Time 0.939 (0.939)#011Loss 1.2897 (1.2897)#011Prec 0.430 (0.430)
validation: [1/18]#011Time 1.011 (0.975)#011Loss 1.3628 (1.3263)#011Prec 0.336 (0.383)
validation: [2/18]#011Time 0.986 (0.979)#011Loss 1.3169 (1.3232)#011Prec 0.430 (0.398)
validation: [3/18]#011Time 1.047 (0.996)#011Loss 1.2925 (1.3155)#011Prec 0.430 (0.406)
validation: [4/18]#011Time 1.067 (1.010)#011Loss 1.2329 (1.2990)#011Prec 0.375 (0.400)
validation: [5/18]#011Time 0.979 (1.005)#011Loss 1.2563 (1.2919)#011Prec 0.430 (0.405)
validation: [6/18]#011Time 0.999 (1.004)#011Loss 1.2748 (1.2894)#011Prec 0.367 (0.400)
validation: [7/18]#011Time 1.091 (1.015)#011Loss 1.2763 (1.2878)#011Prec 0.414 (0.401)
validation: [8/18]#011Time 1.098 (1.024)#011Loss 1.3057 (1.2898)#011Prec 0.398 (0.401)
validation: [9/18]#011Time 0.988 (1.021)#011Loss 1.2410 (1.2849)#011Prec 0.445 (0.405)
validation: [10/18]#011Time 0.987 (1.018)#011Loss 1.4221 (1.2974)#011Prec 0.375 (0.403)
validation: [11/18]#011Time 1.044 (1.020)#011Loss 1.1852 (1.2880)#011Prec 0.445 (0.406)
validation: [12/18]#011Time 0.994 (1.018)#011Loss 1.2710 (1.2867)#011Prec 0.398 (0.406)
validation: [13/18]#011Time 1.108 (1.024)#011Loss 1.3085 (1.2883)#011Prec 0.375 (0.403)
validation: [14/18]#011Time 1.032 (1.025)#011Loss 1.3188 (1.2903)#011Prec 0.414 (0.404)
validation: [15/18]#011Time 1.046 (1.026)#011Loss 1.2330 (1.2867)#011Prec 0.375 (0.402)
validation: [16/18]#011Time 0.984 (1.024)#011Loss 1.2859 (1.2867)#011Prec 0.391 (0.402)
validation: [17/18]#011Time 0.162 (0.976)#011Loss 1.1911 (1.2858)#011Prec 0.450 (0.402)
*Validation Precision: 0.402
Epoch: 8
Epoch: [8][0/12] lr 0.06247#011Time 1.299 (1.299)#011Data 0.748 (0.748)#011Loss 1.2348 (1.2348)#011Prec 0.359 (0.359)
Epoch: [8][1/12] lr 0.06247#011Time 1.314 (1.306)#011Data 0.765 (0.756)#011Loss 1.2964 (1.2656)#011Prec 0.359 (0.359)
Epoch: [8][2/12] lr 0.06247#011Time 1.286 (1.300)#011Data 0.733 (0.749)#011Loss 1.3445 (1.2919)#011Prec 0.375 (0.365)
Epoch: [8][3/12] lr 0.06247#011Time 1.443 (1.336)#011Data 0.890 (0.784)#011Loss 1.2202 (1.2740)#011Prec 0.477 (0.393)
Epoch: [8][4/12] lr 0.06247#011Time 1.359 (1.340)#011Data 0.806 (0.788)#011Loss 1.2370 (1.2666)#011Prec 0.438 (0.402)
Epoch: [8][5/12] lr 0.06247#011Time 1.357 (1.343)#011Data 0.804 (0.791)#011Loss 1.2485 (1.2636)#011Prec 0.422 (0.405)
Epoch: [8][6/12] lr 0.06247#011Time 1.412 (1.353)#011Data 0.859 (0.801)#011Loss 1.3793 (1.2801)#011Prec 0.406 (0.405)
Epoch: [8][7/12] lr 0.06247#011Time 1.311 (1.348)#011Data 0.758 (0.795)#011Loss 1.2965 (1.2822)#011Prec 0.359 (0.399)
Epoch: [8][8/12] lr 0.06247#011Time 1.321 (1.345)#011Data 0.768 (0.792)#011Loss 1.3504 (1.2897)#011Prec 0.406 (0.400)
Epoch: [8][9/12] lr 0.06247#011Time 1.441 (1.354)#011Data 0.889 (0.802)#011Loss 1.3112 (1.2919)#011Prec 0.430 (0.403)
Epoch: [8][10/12] lr 0.06247#011Time 1.327 (1.352)#011Data 0.773 (0.799)#011Loss 1.2679 (1.2897)#011Prec 0.422 (0.405)
Epoch: [8][11/12] lr 0.06247#011Time 0.550 (1.285)#011Data 0.303 (0.758)#011Loss 1.3636 (1.2924)#011Prec 0.453 (0.407)
validation: [0/18]#011Time 0.969 (0.969)#011Loss 1.3892 (1.3892)#011Prec 0.336 (0.336)
validation: [1/18]#011Time 0.963 (0.966)#011Loss 1.3532 (1.3712)#011Prec 0.352 (0.344)
validation: [2/18]#011Time 1.023 (0.985)#011Loss 1.3812 (1.3745)#011Prec 0.406 (0.365)
validation: [3/18]#011Time 1.051 (1.002)#011Loss 1.3639 (1.3719)#011Prec 0.359 (0.363)
validation: [4/18]#011Time 0.984 (0.998)#011Loss 1.3798 (1.3735)#011Prec 0.383 (0.367)
validation: [5/18]#011Time 0.959 (0.992)#011Loss 1.3980 (1.3776)#011Prec 0.359 (0.366)
validation: [6/18]#011Time 1.040 (0.998)#011Loss 1.2842 (1.3642)#011Prec 0.367 (0.366)
validation: [7/18]#011Time 1.014 (1.000)#011Loss 1.3189 (1.3586)#011Prec 0.398 (0.370)
validation: [8/18]#011Time 1.019 (1.002)#011Loss 1.3831 (1.3613)#011Prec 0.336 (0.366)
validation: [9/18]#011Time 1.008 (1.003)#011Loss 1.3294 (1.3581)#011Prec 0.414 (0.371)
validation: [10/18]#011Time 1.124 (1.014)#011Loss 1.4683 (1.3681)#011Prec 0.430 (0.376)
validation: [11/18]#011Time 1.083 (1.020)#011Loss 1.4819 (1.3776)#011Prec 0.383 (0.377)
validation: [12/18]#011Time 1.051 (1.022)#011Loss 1.4596 (1.3839)#011Prec 0.312 (0.372)
validation: [13/18]#011Time 0.971 (1.018)#011Loss 1.3384 (1.3807)#011Prec 0.414 (0.375)
validation: [14/18]#011Time 1.055 (1.021)#011Loss 1.3389 (1.3779)#011Prec 0.359 (0.374)
validation: [15/18]#011Time 0.973 (1.018)#011Loss 1.3722 (1.3775)#011Prec 0.367 (0.374)
validation: [16/18]#011Time 1.007 (1.017)#011Loss 1.4211 (1.3801)#011Prec 0.344 (0.372)
validation: [17/18]#011Time 0.176 (0.971)#011Loss 1.5415 (1.3815)#011Prec 0.250 (0.371)
*Validation Precision: 0.371
Epoch: 9
Epoch: [9][0/12] lr 0.06247#011Time 1.273 (1.273)#011Data 0.719 (0.719)#011Loss 1.3546 (1.3546)#011Prec 0.398 (0.398)
Epoch: [9][5/12] lr 0.06247#011Time 1.384 (1.327)#011Data 0.833 (0.774)#011Loss 1.2452 (1.3011)#011Prec 0.430 (0.411)
Epoch: [9][6/12] lr 0.06247#011Time 1.298 (1.323)#011Data 0.745 (0.770)#011Loss 1.2971 (1.3005)#011Prec 0.406 (0.411)
Epoch: [9][7/12] lr 0.06247#011Time 1.341 (1.325)#011Data 0.790 (0.772)#011Loss 1.2801 (1.2979)#011Prec 0.438 (0.414)
Epoch: [9][8/12] lr 0.06247#011Time 1.340 (1.327)#011Data 0.786 (0.774)#011Loss 1.3323 (1.3018)#011Prec 0.383 (0.411)
Epoch: [9][9/12] lr 0.06247#011Time 1.370 (1.331)#011Data 0.818 (0.778)#011Loss 1.5011 (1.3217)#011Prec 0.289 (0.398)
Epoch: [9][10/12] lr 0.06247#011Time 1.356 (1.333)#011Data 0.805 (0.781)#011Loss 1.3446 (1.3238)#011Prec 0.328 (0.392)
Epoch: [9][11/12] lr 0.06247#011Time 0.547 (1.268)#011Data 0.301 (0.741)#011Loss 1.2666 (1.3217)#011Prec 0.358 (0.391)
validation: [0/18]#011Time 0.953 (0.953)#011Loss 1.1691 (1.1691)#011Prec 0.477 (0.477)
validation: [1/18]#011Time 0.991 (0.972)#011Loss 1.2445 (1.2068)#011Prec 0.391 (0.434)
validation: [2/18]#011Time 1.028 (0.991)#011Loss 1.2473 (1.2203)#011Prec 0.414 (0.427)
validation: [3/18]#011Time 0.974 (0.987)#011Loss 1.3180 (1.2447)#011Prec 0.375 (0.414)
validation: [4/18]#011Time 1.044 (0.998)#011Loss 1.3387 (1.2635)#011Prec 0.367 (0.405)
validation: [5/18]#011Time 0.967 (0.993)#011Loss 1.2748 (1.2654)#011Prec 0.406 (0.405)
validation: [6/18]#011Time 1.037 (0.999)#011Loss 1.3269 (1.2742)#011Prec 0.375 (0.401)
validation: [7/18]#011Time 1.018 (1.001)#011Loss 1.2506 (1.2712)#011Prec 0.438 (0.405)
validation: [8/18]#011Time 1.065 (1.009)#011Loss 1.2574 (1.2697)#011Prec 0.359 (0.400)
validation: [9/18]#011Time 1.038 (1.012)#011Loss 1.2334 (1.2660)#011Prec 0.383 (0.398)
validation: [10/18]#011Time 0.956 (1.006)#011Loss 1.2747 (1.2668)#011Prec 0.406 (0.399)
validation: [11/18]#011Time 1.027 (1.008)#011Loss 1.2301 (1.2638)#011Prec 0.398 (0.399)
validation: [12/18]#011Time 1.068 (1.013)#011Loss 1.2595 (1.2634)#011Prec 0.422 (0.401)
validation: [13/18]#011Time 1.089 (1.018)#011Loss 1.3081 (1.2666)#011Prec 0.469 (0.406)
validation: [14/18]#011Time 1.080 (1.022)#011Loss 1.2800 (1.2675)#011Prec 0.438 (0.408)
validation: [15/18]#011Time 1.128 (1.029)#011Loss 1.3294 (1.2714)#011Prec 0.383 (0.406)
validation: [16/18]#011Time 1.142 (1.036)#011Loss 1.2774 (1.2717)#011Prec 0.398 (0.406)
validation: [17/18]#011Time 0.172 (0.988)#011Loss 1.1415 (1.2706)#011Prec 0.400 (0.406)
*Validation Precision: 0.406
Epoch: 10
Epoch: [10][0/12] lr 0.00625#011Time 1.326 (1.326)#011Data 0.769 (0.769)#011Loss 1.3732 (1.3732)#011Prec 0.438 (0.438)
Epoch: [10][1/12] lr 0.00625#011Time 1.345 (1.335)#011Data 0.793 (0.781)#011Loss 1.2444 (1.3088)#011Prec 0.445 (0.441)
Epoch: [10][2/12] lr 0.00625#011Time 1.388 (1.353)#011Data 0.835 (0.799)#011Loss 1.3875 (1.3351)#011Prec 0.359 (0.414)
Epoch: [10][3/12] lr 0.00625#011Time 1.313 (1.343)#011Data 0.760 (0.789)#011Loss 1.3899 (1.3488)#011Prec 0.406 (0.412)
Epoch: [10][4/12] lr 0.00625#011Time 1.320 (1.338)#011Data 0.768 (0.785)#011Loss 1.3751 (1.3540)#011Prec 0.336 (0.397)
Epoch: [10][5/12] lr 0.00625#011Time 1.347 (1.340)#011Data 0.795 (0.787)#011Loss 1.2695 (1.3399)#011Prec 0.398 (0.397)
Epoch: [10][6/12] lr 0.00625#011Time 1.372 (1.344)#011Data 0.821 (0.792)#011Loss 1.3248 (1.3378)#011Prec 0.438 (0.403)
Epoch: [10][7/12] lr 0.00625#011Time 1.287 (1.337)#011Data 0.736 (0.785)#011Loss 1.3172 (1.3352)#011Prec 0.414 (0.404)
Epoch: [10][8/12] lr 0.00625#011Time 1.328 (1.336)#011Data 0.778 (0.784)#011Loss 1.2744 (1.3284)#011Prec 0.406 (0.405)
Epoch: [10][9/12] lr 0.00625#011Time 1.360 (1.339)#011Data 0.807 (0.786)#011Loss 1.2299 (1.3186)#011Prec 0.422 (0.406)
Epoch: [10][10/12] lr 0.00625#011Time 1.352 (1.340)#011Data 0.796 (0.787)#011Loss 1.2486 (1.3122)#011Prec 0.406 (0.406)
Epoch: [10][11/12] lr 0.00625#011Time 0.581 (1.277)#011Data 0.333 (0.749)#011Loss 1.2733 (1.3108)#011Prec 0.396 (0.406)
validation: [0/18]#011Time 0.964 (0.964)#011Loss 1.2136 (1.2136)#011Prec 0.484 (0.484)
validation: [6/18]#011Time 1.006 (1.027)#011Loss 1.1872 (1.2393)#011Prec 0.453 (0.439)
validation: [7/18]#011Time 1.138 (1.040)#011Loss 1.1622 (1.2297)#011Prec 0.445 (0.439)
validation: [8/18]#011Time 1.063 (1.043)#011Loss 1.2729 (1.2345)#011Prec 0.453 (0.441)
validation: [9/18]#011Time 0.936 (1.032)#011Loss 1.1326 (1.2243)#011Prec 0.508 (0.448)
validation: [10/18]#011Time 1.026 (1.032)#011Loss 1.2573 (1.2273)#011Prec 0.430 (0.446)
validation: [11/18]#011Time 0.999 (1.029)#011Loss 1.3282 (1.2357)#011Prec 0.422 (0.444)
validation: [12/18]#011Time 1.055 (1.031)#011Loss 1.1427 (1.2285)#011Prec 0.453 (0.445)
validation: [13/18]#011Time 0.988 (1.028)#011Loss 1.1684 (1.2242)#011Prec 0.422 (0.443)
validation: [14/18]#011Time 1.011 (1.027)#011Loss 1.2642 (1.2269)#011Prec 0.398 (0.440)
validation: [15/18]#011Time 0.955 (1.022)#011Loss 1.2010 (1.2253)#011Prec 0.430 (0.439)
validation: [16/18]#011Time 0.987 (1.020)#011Loss 1.2617 (1.2274)#011Prec 0.406 (0.438)
validation: [17/18]#011Time 0.163 (0.973)#011Loss 1.2423 (1.2276)#011Prec 0.450 (0.438)
*Validation Precision: 0.438
Epoch: 11
Epoch: [11][0/12] lr 0.00625#011Time 1.312 (1.312)#011Data 0.758 (0.758)#011Loss 1.3387 (1.3387)#011Prec 0.391 (0.391)
Epoch: [11][1/12] lr 0.00625#011Time 1.306 (1.309)#011Data 0.751 (0.755)#011Loss 1.4521 (1.3954)#011Prec 0.289 (0.340)
Epoch: [11][6/12] lr 0.00625#011Time 1.326 (1.339)#011Data 0.772 (0.786)#011Loss 1.2579 (1.2964)#011Prec 0.406 (0.402)
Epoch: [11][7/12] lr 0.00625#011Time 1.395 (1.346)#011Data 0.842 (0.793)#011Loss 1.2387 (1.2892)#011Prec 0.508 (0.415)
Epoch: [11][8/12] lr 0.00625#011Time 1.352 (1.347)#011Data 0.798 (0.794)#011Loss 1.2997 (1.2904)#011Prec 0.445 (0.418)
Epoch: [11][9/12] lr 0.00625#011Time 1.292 (1.341)#011Data 0.737 (0.788)#011Loss 1.2308 (1.2844)#011Prec 0.445 (0.421)
Epoch: [11][10/12] lr 0.00625#011Time 1.329 (1.340)#011Data 0.775 (0.787)#011Loss 1.2701 (1.2831)#011Prec 0.445 (0.423)
Epoch: [11][11/12] lr 0.00625#011Time 0.572 (1.276)#011Data 0.324 (0.748)#011Loss 1.2192 (1.2808)#011Prec 0.472 (0.425)
validation: [0/18]#011Time 1.055 (1.055)#011Loss 1.2025 (1.2025)#011Prec 0.469 (0.469)
validation: [1/18]#011Time 1.018 (1.036)#011Loss 1.3324 (1.2674)#011Prec 0.320 (0.395)
validation: [2/18]#011Time 0.986 (1.019)#011Loss 1.2454 (1.2601)#011Prec 0.445 (0.411)
validation: [3/18]#011Time 1.092 (1.038)#011Loss 1.2300 (1.2526)#011Prec 0.367 (0.400)
validation: [4/18]#011Time 1.018 (1.034)#011Loss 1.2460 (1.2513)#011Prec 0.422 (0.405)
validation: [5/18]#011Time 1.097 (1.044)#011Loss 1.2671 (1.2539)#011Prec 0.398 (0.404)
validation: [6/18]#011Time 0.974 (1.034)#011Loss 1.2414 (1.2521)#011Prec 0.406 (0.404)
validation: [7/18]#011Time 1.001 (1.030)#011Loss 1.2839 (1.2561)#011Prec 0.359 (0.398)
validation: [8/18]#011Time 1.136 (1.042)#011Loss 1.3524 (1.2668)#011Prec 0.414 (0.400)
validation: [9/18]#011Time 1.210 (1.059)#011Loss 1.3018 (1.2703)#011Prec 0.414 (0.402)
validation: [10/18]#011Time 0.998 (1.053)#011Loss 1.3184 (1.2747)#011Prec 0.398 (0.401)
validation: [11/18]#011Time 1.042 (1.052)#011Loss 1.3225 (1.2787)#011Prec 0.375 (0.399)
validation: [12/18]#011Time 1.068 (1.053)#011Loss 1.2898 (1.2795)#011Prec 0.422 (0.401)
validation: [13/18]#011Time 1.053 (1.053)#011Loss 1.2900 (1.2803)#011Prec 0.391 (0.400)
validation: [14/18]#011Time 0.988 (1.049)#011Loss 1.2764 (1.2800)#011Prec 0.414 (0.401)
validation: [15/18]#011Time 1.070 (1.050)#011Loss 1.1807 (1.2738)#011Prec 0.438 (0.403)
validation: [16/18]#011Time 0.953 (1.045)#011Loss 1.2970 (1.2752)#011Prec 0.398 (0.403)
validation: [17/18]#011Time 0.160 (0.995)#011Loss 1.4681 (1.2769)#011Prec 0.300 (0.402)
*Validation Precision: 0.402
Epoch: 12
Epoch: [12][0/12] lr 0.00625#011Time 1.308 (1.308)#011Data 0.755 (0.755)#011Loss 1.2969 (1.2969)#011Prec 0.484 (0.484)
Epoch: [12][1/12] lr 0.00625#011Time 1.333 (1.321)#011Data 0.780 (0.768)#011Loss 1.3045 (1.3007)#011Prec 0.391 (0.438)
Epoch: [12][2/12] lr 0.00625#011Time 1.346 (1.329)#011Data 0.792 (0.776)#011Loss 1.2529 (1.2848)#011Prec 0.430 (0.435)
Epoch: [12][3/12] lr 0.00625#011Time 1.282 (1.317)#011Data 0.730 (0.764)#011Loss 1.2456 (1.2750)#011Prec 0.461 (0.441)
Epoch: [12][4/12] lr 0.00625#011Time 1.321 (1.318)#011Data 0.770 (0.765)#011Loss 1.2751 (1.2750)#011Prec 0.367 (0.427)
Epoch: [12][5/12] lr 0.00625#011Time 1.285 (1.313)#011Data 0.731 (0.760)#011Loss 1.2554 (1.2717)#011Prec 0.438 (0.428)
Epoch: [12][6/12] lr 0.00625#011Time 1.303 (1.311)#011Data 0.746 (0.758)#011Loss 1.2776 (1.2726)#011Prec 0.414 (0.426)
Epoch: [12][7/12] lr 0.00625#011Time 1.314 (1.312)#011Data 0.761 (0.758)#011Loss 1.2788 (1.2734)#011Prec 0.438 (0.428)
Epoch: [12][8/12] lr 0.00625#011Time 1.332 (1.314)#011Data 0.780 (0.761)#011Loss 1.1881 (1.2639)#011Prec 0.516 (0.438)
Epoch: [12][9/12] lr 0.00625#011Time 1.359 (1.318)#011Data 0.805 (0.765)#011Loss 1.2616 (1.2637)#011Prec 0.438 (0.438)
Epoch: [12][10/12] lr 0.00625#011Time 1.342 (1.320)#011Data 0.789 (0.767)#011Loss 1.1939 (1.2573)#011Prec 0.500 (0.443)
Epoch: [12][11/12] lr 0.00625#011Time 0.647 (1.264)#011Data 0.400 (0.737)#011Loss 1.2382 (1.2566)#011Prec 0.377 (0.441)
validation: [0/18]#011Time 1.078 (1.078)#011Loss 1.2098 (1.2098)#011Prec 0.430 (0.430)
validation: [1/18]#011Time 1.049 (1.063)#011Loss 1.3599 (1.2848)#011Prec 0.445 (0.438)
validation: [2/18]#011Time 1.132 (1.086)#011Loss 1.2400 (1.2699)#011Prec 0.469 (0.448)
validation: [3/18]#011Time 1.004 (1.066)#011Loss 1.3062 (1.2790)#011Prec 0.359 (0.426)
validation: [4/18]#011Time 1.105 (1.074)#011Loss 1.2528 (1.2737)#011Prec 0.391 (0.419)
validation: [5/18]#011Time 0.959 (1.055)#011Loss 1.2225 (1.2652)#011Prec 0.336 (0.405)
validation: [6/18]#011Time 1.035 (1.052)#011Loss 1.2787 (1.2671)#011Prec 0.320 (0.393)
validation: [7/18]#011Time 1.042 (1.051)#011Loss 1.2390 (1.2636)#011Prec 0.383 (0.392)
validation: [8/18]#011Time 0.977 (1.042)#011Loss 1.2703 (1.2643)#011Prec 0.406 (0.393)
validation: [9/18]#011Time 0.937 (1.032)#011Loss 1.3162 (1.2695)#011Prec 0.359 (0.390)
validation: [10/18]#011Time 1.081 (1.036)#011Loss 1.2558 (1.2683)#011Prec 0.406 (0.391)
validation: [11/18]#011Time 0.953 (1.029)#011Loss 1.3093 (1.2717)#011Prec 0.398 (0.392)
validation: [12/18]#011Time 0.990 (1.026)#011Loss 1.2447 (1.2696)#011Prec 0.391 (0.392)
validation: [13/18]#011Time 1.053 (1.028)#011Loss 1.2887 (1.2710)#011Prec 0.375 (0.391)
validation: [14/18]#011Time 1.069 (1.031)#011Loss 1.2128 (1.2671)#011Prec 0.430 (0.393)
validation: [15/18]#011Time 0.973 (1.027)#011Loss 1.2260 (1.2645)#011Prec 0.438 (0.396)
validation: [16/18]#011Time 0.984 (1.025)#011Loss 1.2719 (1.2650)#011Prec 0.430 (0.398)
validation: [17/18]#011Time 0.174 (0.978)#011Loss 1.1740 (1.2641)#011Prec 0.450 (0.398)
*Validation Precision: 0.398
Epoch: 13
Epoch: [13][0/12] lr 0.00625#011Time 1.333 (1.333)#011Data 0.779 (0.779)#011Loss 1.1634 (1.1634)#011Prec 0.539 (0.539)
Epoch: [13][1/12] lr 0.00625#011Time 1.329 (1.331)#011Data 0.778 (0.778)#011Loss 1.1827 (1.1731)#011Prec 0.516 (0.527)
Epoch: [13][2/12] lr 0.00625#011Time 1.296 (1.319)#011Data 0.743 (0.767)#011Loss 1.3044 (1.2168)#011Prec 0.344 (0.466)
Epoch: [13][3/12] lr 0.00625#011Time 1.339 (1.324)#011Data 0.786 (0.771)#011Loss 1.2605 (1.2277)#011Prec 0.430 (0.457)
Epoch: [13][4/12] lr 0.00625#011Time 1.365 (1.332)#011Data 0.810 (0.779)#011Loss 1.2198 (1.2262)#011Prec 0.484 (0.463)
Epoch: [13][5/12] lr 0.00625#011Time 1.378 (1.340)#011Data 0.825 (0.787)#011Loss 1.2219 (1.2255)#011Prec 0.492 (0.467)
Epoch: [13][6/12] lr 0.00625#011Time 1.304 (1.335)#011Data 0.750 (0.782)#011Loss 1.2336 (1.2266)#011Prec 0.445 (0.464)
Epoch: [13][7/12] lr 0.00625#011Time 1.317 (1.333)#011Data 0.764 (0.779)#011Loss 1.2303 (1.2271)#011Prec 0.445 (0.462)
Epoch: [13][8/12] lr 0.00625#011Time 1.372 (1.337)#011Data 0.816 (0.783)#011Loss 1.3156 (1.2369)#011Prec 0.453 (0.461)
Epoch: [13][9/12] lr 0.00625#011Time 1.306 (1.334)#011Data 0.754 (0.780)#011Loss 1.3005 (1.2433)#011Prec 0.453 (0.460)
Epoch: [13][10/12] lr 0.00625#011Time 1.287 (1.330)#011Data 0.734 (0.776)#011Loss 1.1953 (1.2389)#011Prec 0.414 (0.456)
Epoch: [13][11/12] lr 0.00625#011Time 0.624 (1.271)#011Data 0.377 (0.743)#011Loss 1.3464 (1.2428)#011Prec 0.358 (0.452)
validation: [0/18]#011Time 0.994 (0.994)#011Loss 1.2096 (1.2096)#011Prec 0.469 (0.469)
validation: [1/18]#011Time 0.936 (0.965)#011Loss 1.2804 (1.2450)#011Prec 0.367 (0.418)
validation: [2/18]#011Time 1.002 (0.977)#011Loss 1.2395 (1.2432)#011Prec 0.398 (0.411)
validation: [3/18]#011Time 1.249 (1.045)#011Loss 1.2044 (1.2335)#011Prec 0.461 (0.424)
validation: [4/18]#011Time 1.181 (1.072)#011Loss 1.2923 (1.2452)#011Prec 0.359 (0.411)
validation: [5/18]#011Time 1.006 (1.061)#011Loss 1.2830 (1.2515)#011Prec 0.406 (0.410)
validation: [6/18]#011Time 0.978 (1.049)#011Loss 1.2458 (1.2507)#011Prec 0.398 (0.408)
validation: [7/18]#011Time 1.067 (1.052)#011Loss 1.2487 (1.2505)#011Prec 0.367 (0.403)
validation: [8/18]#011Time 0.967 (1.042)#011Loss 1.2507 (1.2505)#011Prec 0.383 (0.401)
validation: [9/18]#011Time 0.984 (1.036)#011Loss 1.1877 (1.2442)#011Prec 0.461 (0.407)
validation: [10/18]#011Time 1.056 (1.038)#011Loss 1.2118 (1.2413)#011Prec 0.477 (0.413)
validation: [11/18]#011Time 0.927 (1.029)#011Loss 1.2719 (1.2438)#011Prec 0.422 (0.414)
validation: [12/18]#011Time 0.967 (1.024)#011Loss 1.2554 (1.2447)#011Prec 0.469 (0.418)
validation: [13/18]#011Time 1.079 (1.028)#011Loss 1.1515 (1.2381)#011Prec 0.469 (0.422)
validation: [14/18]#011Time 1.025 (1.028)#011Loss 1.3302 (1.2442)#011Prec 0.422 (0.422)
validation: [15/18]#011Time 0.982 (1.025)#011Loss 1.2617 (1.2453)#011Prec 0.383 (0.419)
validation: [16/18]#011Time 0.960 (1.021)#011Loss 1.1218 (1.2380)#011Prec 0.398 (0.418)
validation: [17/18]#011Time 0.191 (0.975)#011Loss 1.0007 (1.2359)#011Prec 0.550 (0.419)
*Validation Precision: 0.419
Epoch: 14
Epoch: [14][0/12] lr 0.00625#011Time 1.304 (1.304)#011Data 0.751 (0.751)#011Loss 1.2968 (1.2968)#011Prec 0.383 (0.383)
Epoch: [14][1/12] lr 0.00625#011Time 1.317 (1.311)#011Data 0.762 (0.756)#011Loss 1.1753 (1.2361)#011Prec 0.523 (0.453)
Epoch: [14][2/12] lr 0.00625#011Time 1.323 (1.315)#011Data 0.770 (0.761)#011Loss 1.2179 (1.2300)#011Prec 0.383 (0.430)
Epoch: [14][3/12] lr 0.00625#011Time 1.355 (1.325)#011Data 0.803 (0.771)#011Loss 1.2308 (1.2302)#011Prec 0.461 (0.438)
Epoch: [14][4/12] lr 0.00625#011Time 1.346 (1.329)#011Data 0.795 (0.776)#011Loss 1.2646 (1.2371)#011Prec 0.422 (0.434)
Epoch: [14][5/12] lr 0.00625#011Time 1.329 (1.329)#011Data 0.778 (0.776)#011Loss 1.2169 (1.2337)#011Prec 0.430 (0.434)
Epoch: [14][6/12] lr 0.00625#011Time 1.302 (1.325)#011Data 0.748 (0.772)#011Loss 1.2243 (1.2324)#011Prec 0.453 (0.436)
Epoch: [14][7/12] lr 0.00625#011Time 1.361 (1.330)#011Data 0.805 (0.776)#011Loss 1.1873 (1.2267)#011Prec 0.508 (0.445)
Epoch: [14][8/12] lr 0.00625#011Time 1.320 (1.329)#011Data 0.768 (0.776)#011Loss 1.2733 (1.2319)#011Prec 0.383 (0.438)
Epoch: [14][9/12] lr 0.00625#011Time 1.326 (1.328)#011Data 0.775 (0.775)#011Loss 1.3203 (1.2408)#011Prec 0.445 (0.439)
Epoch: [14][10/12] lr 0.00625#011Time 1.381 (1.333)#011Data 0.827 (0.780)#011Loss 1.1771 (1.2350)#011Prec 0.461 (0.441)
Epoch: [14][11/12] lr 0.00625#011Time 0.531 (1.266)#011Data 0.285 (0.739)#011Loss 1.3257 (1.2383)#011Prec 0.434 (0.441)
validation: [0/18]#011Time 0.971 (0.971)#011Loss 1.3063 (1.3063)#011Prec 0.406 (0.406)
validation: [1/18]#011Time 1.034 (1.003)#011Loss 1.2204 (1.2633)#011Prec 0.438 (0.422)
validation: [2/18]#011Time 1.006 (1.004)#011Loss 1.2430 (1.2566)#011Prec 0.453 (0.432)
validation: [3/18]#011Time 1.037 (1.012)#011Loss 1.2906 (1.2651)#011Prec 0.391 (0.422)
validation: [4/18]#011Time 1.009 (1.011)#011Loss 1.2656 (1.2652)#011Prec 0.438 (0.425)
validation: [5/18]#011Time 0.968 (1.004)#011Loss 1.1605 (1.2477)#011Prec 0.477 (0.434)
validation: [6/18]#011Time 1.133 (1.023)#011Loss 1.1913 (1.2397)#011Prec 0.461 (0.438)
validation: [7/18]#011Time 1.002 (1.020)#011Loss 1.3311 (1.2511)#011Prec 0.336 (0.425)
validation: [8/18]#011Time 1.098 (1.029)#011Loss 1.3009 (1.2566)#011Prec 0.328 (0.414)
validation: [9/18]#011Time 0.966 (1.022)#011Loss 1.2724 (1.2582)#011Prec 0.367 (0.409)
validation: [10/18]#011Time 0.969 (1.018)#011Loss 1.1481 (1.2482)#011Prec 0.508 (0.418)
validation: [11/18]#011Time 1.024 (1.018)#011Loss 1.0836 (1.2345)#011Prec 0.484 (0.424)
validation: [12/18]#011Time 0.998 (1.017)#011Loss 1.2490 (1.2356)#011Prec 0.430 (0.424)
validation: [13/18]#011Time 0.977 (1.014)#011Loss 1.2628 (1.2375)#011Prec 0.391 (0.422)
validation: [14/18]#011Time 1.057 (1.017)#011Loss 1.1479 (1.2316)#011Prec 0.500 (0.427)
validation: [15/18]#011Time 0.990 (1.015)#011Loss 1.2482 (1.2326)#011Prec 0.391 (0.425)
validation: [16/18]#011Time 1.031 (1.016)#011Loss 1.2419 (1.2332)#011Prec 0.391 (0.423)
validation: [17/18]#011Time 0.161 (0.968)#011Loss 1.1687 (1.2326)#011Prec 0.550 (0.424)
*Validation Precision: 0.424
Epoch: 15
Epoch: [15][0/12] lr 0.00625#011Time 1.393 (1.393)#011Data 0.837 (0.837)#011Loss 1.1890 (1.1890)#011Prec 0.422 (0.422)
Epoch: [15][1/12] lr 0.00625#011Time 1.309 (1.351)#011Data 0.757 (0.797)#011Loss 1.2164 (1.2027)#011Prec 0.453 (0.438)
Epoch: [15][2/12] lr 0.00625#011Time 1.335 (1.346)#011Data 0.781 (0.792)#011Loss 1.2168 (1.2074)#011Prec 0.430 (0.435)
Epoch: [15][3/12] lr 0.00625#011Time 1.366 (1.351)#011Data 0.815 (0.797)#011Loss 1.2543 (1.2192)#011Prec 0.398 (0.426)
Epoch: [15][4/12] lr 0.00625#011Time 1.308 (1.342)#011Data 0.755 (0.789)#011Loss 1.2522 (1.2258)#011Prec 0.445 (0.430)
Epoch: [15][5/12] lr 0.00625#011Time 1.342 (1.342)#011Data 0.790 (0.789)#011Loss 1.2447 (1.2289)#011Prec 0.414 (0.427)
Epoch: [15][6/12] lr 0.00625#011Time 1.331 (1.341)#011Data 0.779 (0.788)#011Loss 1.2407 (1.2306)#011Prec 0.484 (0.435)
Epoch: [15][7/12] lr 0.00625#011Time 1.375 (1.345)#011Data 0.824 (0.792)#011Loss 1.1629 (1.2221)#011Prec 0.516 (0.445)
Epoch: [15][8/12] lr 0.00625#011Time 1.333 (1.344)#011Data 0.778 (0.791)#011Loss 1.2294 (1.2229)#011Prec 0.430 (0.444)
Epoch: [15][9/12] lr 0.00625#011Time 1.380 (1.347)#011Data 0.826 (0.794)#011Loss 1.2002 (1.2207)#011Prec 0.500 (0.449)
Epoch: [15][10/12] lr 0.00625#011Time 1.373 (1.350)#011Data 0.817 (0.796)#011Loss 1.2262 (1.2212)#011Prec 0.383 (0.443)
Epoch: [15][11/12] lr 0.00625#011Time 0.578 (1.285)#011Data 0.330 (0.758)#011Loss 1.2879 (1.2236)#011Prec 0.340 (0.439)
validation: [0/18]#011Time 1.057 (1.057)#011Loss 1.2151 (1.2151)#011Prec 0.469 (0.469)
validation: [1/18]#011Time 0.975 (1.016)#011Loss 1.1678 (1.1915)#011Prec 0.477 (0.473)
validation: [2/18]#011Time 1.112 (1.048)#011Loss 1.1972 (1.1934)#011Prec 0.391 (0.445)
validation: [3/18]#011Time 0.993 (1.034)#011Loss 1.2829 (1.2158)#011Prec 0.359 (0.424)
validation: [4/18]#011Time 1.047 (1.037)#011Loss 1.3074 (1.2341)#011Prec 0.305 (0.400)
validation: [5/18]#011Time 0.970 (1.026)#011Loss 1.2335 (1.2340)#011Prec 0.406 (0.401)
validation: [6/18]#011Time 1.068 (1.032)#011Loss 1.1819 (1.2266)#011Prec 0.469 (0.411)
validation: [7/18]#011Time 0.966 (1.023)#011Loss 1.2146 (1.2251)#011Prec 0.461 (0.417)
validation: [8/18]#011Time 1.089 (1.031)#011Loss 1.2363 (1.2263)#011Prec 0.414 (0.417)
validation: [9/18]#011Time 1.019 (1.030)#011Loss 1.1715 (1.2208)#011Prec 0.445 (0.420)
validation: [10/18]#011Time 1.082 (1.034)#011Loss 1.1070 (1.2105)#011Prec 0.453 (0.423)
validation: [11/18]#011Time 1.003 (1.032)#011Loss 1.1409 (1.2047)#011Prec 0.477 (0.427)
validation: [12/18]#011Time 1.120 (1.039)#011Loss 1.1580 (1.2011)#011Prec 0.500 (0.433)
validation: [13/18]#011Time 1.006 (1.036)#011Loss 1.3803 (1.2139)#011Prec 0.414 (0.431)
validation: [14/18]#011Time 1.004 (1.034)#011Loss 1.2359 (1.2154)#011Prec 0.438 (0.432)
validation: [15/18]#011Time 1.028 (1.034)#011Loss 1.2124 (1.2152)#011Prec 0.500 (0.436)
validation: [16/18]#011Time 0.962 (1.029)#011Loss 1.2571 (1.2176)#011Prec 0.453 (0.437)
validation: [17/18]#011Time 0.152 (0.981)#011Loss 1.1830 (1.2173)#011Prec 0.400 (0.437)
*Validation Precision: 0.437
Epoch: 16
Epoch: [16][0/12] lr 0.00625#011Time 1.371 (1.371)#011Data 0.820 (0.820)#011Loss 1.1885 (1.1885)#011Prec 0.477 (0.477)
Epoch: [16][1/12] lr 0.00625#011Time 1.385 (1.378)#011Data 0.833 (0.826)#011Loss 1.2199 (1.2042)#011Prec 0.422 (0.449)
Epoch: [16][2/12] lr 0.00625#011Time 1.321 (1.359)#011Data 0.768 (0.807)#011Loss 1.2041 (1.2042)#011Prec 0.438 (0.445)
Epoch: [16][3/12] lr 0.00625#011Time 1.363 (1.360)#011Data 0.810 (0.808)#011Loss 1.2497 (1.2156)#011Prec 0.453 (0.447)
Epoch: [16][4/12] lr 0.00625#011Time 1.322 (1.352)#011Data 0.768 (0.800)#011Loss 1.1965 (1.2118)#011Prec 0.508 (0.459)
Epoch: [16][5/12] lr 0.00625#011Time 1.315 (1.346)#011Data 0.763 (0.794)#011Loss 1.2326 (1.2152)#011Prec 0.484 (0.464)
Epoch: [16][6/12] lr 0.00625#011Time 1.322 (1.343)#011Data 0.769 (0.790)#011Loss 1.3339 (1.2322)#011Prec 0.367 (0.450)
Epoch: [16][7/12] lr 0.00625#011Time 1.324 (1.340)#011Data 0.771 (0.788)#011Loss 1.2008 (1.2283)#011Prec 0.438 (0.448)
Epoch: [16][8/12] lr 0.00625#011Time 1.301 (1.336)#011Data 0.745 (0.783)#011Loss 1.2306 (1.2285)#011Prec 0.477 (0.451)
Epoch: [16][9/12] lr 0.00625#011Time 1.312 (1.334)#011Data 0.758 (0.781)#011Loss 1.1324 (1.2189)#011Prec 0.477 (0.454)
Epoch: [16][10/12] lr 0.00625#011Time 1.350 (1.335)#011Data 0.798 (0.782)#011Loss 1.1557 (1.2132)#011Prec 0.500 (0.458)
Epoch: [16][11/12] lr 0.00625#011Time 0.557 (1.270)#011Data 0.310 (0.743)#011Loss 1.0745 (1.2081)#011Prec 0.547 (0.461)
validation: [0/18]#011Time 1.004 (1.004)#011Loss 1.1677 (1.1677)#011Prec 0.484 (0.484)
validation: [1/18]#011Time 0.948 (0.976)#011Loss 1.2363 (1.2020)#011Prec 0.422 (0.453)
validation: [2/18]#011Time 0.936 (0.963)#011Loss 1.1570 (1.1870)#011Prec 0.469 (0.458)
validation: [3/18]#011Time 1.033 (0.980)#011Loss 1.2595 (1.2051)#011Prec 0.398 (0.443)
validation: [4/18]#011Time 1.000 (0.984)#011Loss 1.2246 (1.2090)#011Prec 0.430 (0.441)
validation: [5/18]#011Time 1.085 (1.001)#011Loss 1.2377 (1.2138)#011Prec 0.477 (0.447)
validation: [6/18]#011Time 1.051 (1.008)#011Loss 1.1672 (1.2072)#011Prec 0.453 (0.448)
validation: [7/18]#011Time 1.084 (1.018)#011Loss 1.1874 (1.2047)#011Prec 0.422 (0.444)
validation: [8/18]#011Time 1.017 (1.018)#011Loss 1.2254 (1.2070)#011Prec 0.453 (0.445)
validation: [9/18]#011Time 1.122 (1.028)#011Loss 1.2305 (1.2093)#011Prec 0.367 (0.438)
validation: [10/18]#011Time 0.993 (1.025)#011Loss 1.1897 (1.2076)#011Prec 0.438 (0.438)
validation: [11/18]#011Time 0.973 (1.021)#011Loss 1.2842 (1.2139)#011Prec 0.438 (0.438)
validation: [12/18]#011Time 0.958 (1.016)#011Loss 1.2021 (1.2130)#011Prec 0.445 (0.438)
validation: [13/18]#011Time 1.002 (1.015)#011Loss 1.2813 (1.2179)#011Prec 0.438 (0.438)
validation: [14/18]#011Time 1.111 (1.021)#011Loss 1.2195 (1.2180)#011Prec 0.438 (0.438)
validation: [15/18]#011Time 1.023 (1.021)#011Loss 1.2044 (1.2172)#011Prec 0.383 (0.435)
validation: [16/18]#011Time 0.970 (1.018)#011Loss 1.1862 (1.2153)#011Prec 0.484 (0.438)
validation: [17/18]#011Time 0.150 (0.970)#011Loss 1.0342 (1.2137)#011Prec 0.500 (0.438)
*Validation Precision: 0.438
Epoch: 17
Epoch: [17][0/12] lr 0.00625#011Time 1.329 (1.329)#011Data 0.776 (0.776)#011Loss 1.1874 (1.1874)#011Prec 0.438 (0.438)
Epoch: [17][1/12] lr 0.00625#011Time 1.423 (1.376)#011Data 0.872 (0.824)#011Loss 1.2906 (1.2390)#011Prec 0.406 (0.422)
Epoch: [17][2/12] lr 0.00625#011Time 1.370 (1.374)#011Data 0.815 (0.821)#011Loss 1.2692 (1.2491)#011Prec 0.391 (0.411)
Epoch: [17][3/12] lr 0.00625#011Time 1.282 (1.351)#011Data 0.730 (0.798)#011Loss 1.1729 (1.2300)#011Prec 0.469 (0.426)
Epoch: [17][4/12] lr 0.00625#011Time 1.377 (1.356)#011Data 0.823 (0.803)#011Loss 1.1907 (1.2222)#011Prec 0.484 (0.438)
Epoch: [17][5/12] lr 0.00625#011Time 1.363 (1.357)#011Data 0.812 (0.805)#011Loss 1.2181 (1.2215)#011Prec 0.469 (0.443)
Epoch: [17][6/12] lr 0.00625#011Time 1.352 (1.356)#011Data 0.795 (0.803)#011Loss 1.1725 (1.2145)#011Prec 0.461 (0.445)
Epoch: [17][7/12] lr 0.00625#011Time 1.339 (1.354)#011Data 0.786 (0.801)#011Loss 1.3448 (1.2308)#011Prec 0.398 (0.439)
Epoch: [17][8/12] lr 0.00625#011Time 1.331 (1.352)#011Data 0.779 (0.799)#011Loss 1.1882 (1.2260)#011Prec 0.500 (0.446)
Epoch: [17][9/12] lr 0.00625#011Time 1.275 (1.344)#011Data 0.723 (0.791)#011Loss 1.2639 (1.2298)#011Prec 0.461 (0.448)
Epoch: [17][10/12] lr 0.00625#011Time 1.325 (1.342)#011Data 0.771 (0.789)#011Loss 1.2229 (1.2292)#011Prec 0.461 (0.449)
Epoch: [17][11/12] lr 0.00625#011Time 0.522 (1.274)#011Data 0.273 (0.746)#011Loss 1.4058 (1.2356)#011Prec 0.358 (0.446)
validation: [0/18]#011Time 1.023 (1.023)#011Loss 1.2009 (1.2009)#011Prec 0.445 (0.445)
validation: [1/18]#011Time 1.154 (1.089)#011Loss 1.1921 (1.1965)#011Prec 0.461 (0.453)
validation: [2/18]#011Time 0.981 (1.053)#011Loss 1.2200 (1.2043)#011Prec 0.453 (0.453)
validation: [3/18]#011Time 1.082 (1.060)#011Loss 1.1970 (1.2025)#011Prec 0.500 (0.465)
validation: [4/18]#011Time 0.991 (1.046)#011Loss 1.2004 (1.2021)#011Prec 0.398 (0.452)
validation: [5/18]#011Time 1.042 (1.046)#011Loss 1.2092 (1.2033)#011Prec 0.414 (0.445)
validation: [6/18]#011Time 1.031 (1.044)#011Loss 1.1787 (1.1998)#011Prec 0.430 (0.443)
validation: [7/18]#011Time 0.955 (1.033)#011Loss 1.2904 (1.2111)#011Prec 0.414 (0.439)
validation: [8/18]#011Time 1.053 (1.035)#011Loss 1.2060 (1.2105)#011Prec 0.430 (0.438)
validation: [9/18]#011Time 1.032 (1.035)#011Loss 1.2066 (1.2101)#011Prec 0.461 (0.441)
validation: [10/18]#011Time 1.047 (1.036)#011Loss 1.1703 (1.2065)#011Prec 0.469 (0.443)
validation: [11/18]#011Time 1.010 (1.034)#011Loss 1.1925 (1.2053)#011Prec 0.422 (0.441)
validation: [12/18]#011Time 1.030 (1.033)#011Loss 1.3067 (1.2131)#011Prec 0.477 (0.444)
validation: [13/18]#011Time 0.967 (1.029)#011Loss 1.2469 (1.2156)#011Prec 0.391 (0.440)
validation: [14/18]#011Time 0.990 (1.026)#011Loss 1.1219 (1.2093)#011Prec 0.500 (0.444)
validation: [15/18]#011Time 1.038 (1.027)#011Loss 1.2353 (1.2109)#011Prec 0.375 (0.440)
validation: [16/18]#011Time 1.029 (1.027)#011Loss 1.1735 (1.2087)#011Prec 0.484 (0.443)
validation: [17/18]#011Time 0.168 (0.979)#011Loss 1.4359 (1.2108)#011Prec 0.400 (0.442)
*Validation Precision: 0.442
Epoch: 18
Epoch: [18][0/12] lr 0.00625#011Time 1.361 (1.361)#011Data 0.808 (0.808)#011Loss 1.1874 (1.1874)#011Prec 0.430 (0.430)
Epoch: [18][1/12] lr 0.00625#011Time 1.397 (1.379)#011Data 0.844 (0.826)#011Loss 1.2742 (1.2308)#011Prec 0.445 (0.438)
Epoch: [18][2/12] lr 0.00625#011Time 1.336 (1.365)#011Data 0.784 (0.812)#011Loss 1.1781 (1.2132)#011Prec 0.477 (0.451)
Epoch: [18][3/12] lr 0.00625#011Time 1.305 (1.350)#011Data 0.751 (0.797)#011Loss 1.2602 (1.2250)#011Prec 0.445 (0.449)
Epoch: [18][4/12] lr 0.00625#011Time 1.326 (1.345)#011Data 0.770 (0.791)#011Loss 1.1866 (1.2173)#011Prec 0.430 (0.445)
Epoch: [18][5/12] lr 0.00625#011Time 1.341 (1.345)#011Data 0.788 (0.791)#011Loss 1.2446 (1.2218)#011Prec 0.438 (0.444)
Epoch: [18][6/12] lr 0.00625#011Time 1.312 (1.340)#011Data 0.758 (0.786)#011Loss 1.1543 (1.2122)#011Prec 0.484 (0.450)
Epoch: [18][7/12] lr 0.00625#011Time 1.355 (1.342)#011Data 0.801 (0.788)#011Loss 1.1271 (1.2015)#011Prec 0.508 (0.457)
Epoch: [18][8/12] lr 0.00625#011Time 1.426 (1.351)#011Data 0.873 (0.797)#011Loss 1.2127 (1.2028)#011Prec 0.453 (0.457)
Epoch: [18][9/12] lr 0.00625#011Time 1.314 (1.347)#011Data 0.764 (0.794)#011Loss 1.1975 (1.2023)#011Prec 0.453 (0.456)
Epoch: [18][10/12] lr 0.00625#011Time 1.341 (1.347)#011Data 0.784 (0.793)#011Loss 1.1753 (1.1998)#011Prec 0.484 (0.459)
Epoch: [18][11/12] lr 0.00625#011Time 0.602 (1.285)#011Data 0.353 (0.757)#011Loss 1.2778 (1.2026)#011Prec 0.453 (0.459)
validation: [0/18]#011Time 1.033 (1.033)#011Loss 1.2657 (1.2657)#011Prec 0.391 (0.391)
validation: [1/18]#011Time 0.961 (0.997)#011Loss 1.1753 (1.2205)#011Prec 0.516 (0.453)
validation: [2/18]#011Time 1.020 (1.005)#011Loss 1.1971 (1.2127)#011Prec 0.391 (0.432)
validation: [3/18]#011Time 0.953 (0.992)#011Loss 1.2531 (1.2228)#011Prec 0.430 (0.432)
validation: [4/18]#011Time 1.018 (0.997)#011Loss 1.2278 (1.2238)#011Prec 0.391 (0.423)
validation: [5/18]#011Time 1.001 (0.998)#011Loss 1.1366 (1.2093)#011Prec 0.508 (0.438)
validation: [6/18]#011Time 1.073 (1.008)#011Loss 1.2632 (1.2170)#011Prec 0.484 (0.444)
validation: [7/18]#011Time 1.081 (1.018)#011Loss 1.1414 (1.2075)#011Prec 0.492 (0.450)
validation: [8/18]#011Time 1.086 (1.025)#011Loss 1.2318 (1.2102)#011Prec 0.438 (0.449)
validation: [9/18]#011Time 0.987 (1.021)#011Loss 1.0808 (1.1973)#011Prec 0.461 (0.450)
validation: [10/18]#011Time 0.983 (1.018)#011Loss 1.0798 (1.1866)#011Prec 0.492 (0.454)
validation: [11/18]#011Time 1.064 (1.022)#011Loss 1.2432 (1.1913)#011Prec 0.469 (0.455)
validation: [12/18]#011Time 1.025 (1.022)#011Loss 1.1997 (1.1920)#011Prec 0.430 (0.453)
validation: [13/18]#011Time 1.023 (1.022)#011Loss 1.2170 (1.1938)#011Prec 0.445 (0.453)
validation: [14/18]#011Time 1.038 (1.023)#011Loss 1.2461 (1.1973)#011Prec 0.422 (0.451)
validation: [15/18]#011Time 1.009 (1.022)#011Loss 1.0608 (1.1887)#011Prec 0.570 (0.458)
validation: [16/18]#011Time 0.976 (1.019)#011Loss 1.1975 (1.1892)#011Prec 0.453 (0.458)
validation: [17/18]#011Time 0.175 (0.973)#011Loss 1.1116 (1.1885)#011Prec 0.500 (0.458)
*Validation Precision: 0.458
Epoch: 19
Epoch: [19][0/12] lr 0.00625#011Time 1.321 (1.321)#011Data 0.765 (0.765)#011Loss 1.2512 (1.2512)#011Prec 0.461 (0.461)
Epoch: [19][1/12] lr 0.00625#011Time 1.305 (1.313)#011Data 0.750 (0.758)#011Loss 1.0968 (1.1740)#011Prec 0.484 (0.473)
Epoch: [19][2/12] lr 0.00625#011Time 1.408 (1.345)#011Data 0.850 (0.789)#011Loss 1.1770 (1.1750)#011Prec 0.508 (0.484)
Epoch: [19][3/12] lr 0.00625#011Time 1.403 (1.359)#011Data 0.847 (0.803)#011Loss 1.2143 (1.1848)#011Prec 0.445 (0.475)
Epoch: [19][4/12] lr 0.00625#011Time 1.399 (1.367)#011Data 0.843 (0.811)#011Loss 1.1791 (1.1837)#011Prec 0.508 (0.481)
Epoch: [19][5/12] lr 0.00625#011Time 1.354 (1.365)#011Data 0.797 (0.809)#011Loss 1.1311 (1.1749)#011Prec 0.453 (0.477)
Epoch: [19][6/12] lr 0.00625#011Time 1.386 (1.368)#011Data 0.831 (0.812)#011Loss 1.1577 (1.1725)#011Prec 0.484 (0.478)
Epoch: [19][7/12] lr 0.00625#011Time 1.382 (1.370)#011Data 0.827 (0.814)#011Loss 1.1925 (1.1750)#011Prec 0.430 (0.472)
Epoch: [19][8/12] lr 0.00625#011Time 1.307 (1.363)#011Data 0.750 (0.807)#011Loss 1.3105 (1.1900)#011Prec 0.414 (0.465)
Epoch: [19][9/12] lr 0.00625#011Time 1.280 (1.355)#011Data 0.724 (0.798)#011Loss 1.3298 (1.2040)#011Prec 0.398 (0.459)
Epoch: [19][10/12] lr 0.00625#011Time 1.454 (1.364)#011Data 0.897 (0.807)#011Loss 1.2600 (1.2091)#011Prec 0.500 (0.462)
Epoch: [19][11/12] lr 0.00625#011Time 0.549 (1.296)#011Data 0.302 (0.765)#011Loss 1.2253 (1.2097)#011Prec 0.585 (0.467)
validation: [0/18]#011Time 1.036 (1.036)#011Loss 1.2343 (1.2343)#011Prec 0.461 (0.461)
validation: [1/18]#011Time 1.060 (1.048)#011Loss 1.1721 (1.2032)#011Prec 0.477 (0.469)
validation: [2/18]#011Time 1.000 (1.032)#011Loss 1.1917 (1.1994)#011Prec 0.492 (0.477)
validation: [3/18]#011Time 0.997 (1.023)#011Loss 1.1130 (1.1778)#011Prec 0.516 (0.486)
validation: [4/18]#011Time 1.115 (1.042)#011Loss 1.1581 (1.1739)#011Prec 0.438 (0.477)
validation: [5/18]#011Time 1.181 (1.065)#011Loss 1.2559 (1.1875)#011Prec 0.406 (0.465)
validation: [6/18]#011Time 1.036 (1.061)#011Loss 1.3045 (1.2042)#011Prec 0.359 (0.450)
validation: [7/18]#011Time 0.982 (1.051)#011Loss 1.2151 (1.2056)#011Prec 0.438 (0.448)
validation: [8/18]#011Time 0.962 (1.041)#011Loss 1.1983 (1.2048)#011Prec 0.445 (0.448)
validation: [9/18]#011Time 1.013 (1.038)#011Loss 1.2051 (1.2048)#011Prec 0.469 (0.450)
validation: [10/18]#011Time 1.003 (1.035)#011Loss 1.3003 (1.2135)#011Prec 0.359 (0.442)
validation: [11/18]#011Time 1.202 (1.049)#011Loss 1.2638 (1.2177)#011Prec 0.391 (0.438)
validation: [12/18]#011Time 0.976 (1.043)#011Loss 1.2218 (1.2180)#011Prec 0.453 (0.439)
validation: [13/18]#011Time 1.025 (1.042)#011Loss 1.1586 (1.2138)#011Prec 0.508 (0.444)
validation: [14/18]#011Time 1.006 (1.039)#011Loss 1.1045 (1.2065)#011Prec 0.492 (0.447)
validation: [15/18]#011Time 0.993 (1.037)#011Loss 1.2810 (1.2111)#011Prec 0.422 (0.445)
validation: [16/18]#011Time 1.041 (1.037)#011Loss 1.2055 (1.2108)#011Prec 0.445 (0.445)
validation: [17/18]#011Time 0.159 (0.988)#011Loss 1.3257 (1.2118)#011Prec 0.450 (0.445)
*Validation Precision: 0.445
Epoch: 20
Epoch: [20][0/12] lr 0.00062#011Time 1.310 (1.310)#011Data 0.755 (0.755)#011Loss 1.1390 (1.1390)#011Prec 0.438 (0.438)
Epoch: [20][1/12] lr 0.00062#011Time 1.377 (1.344)#011Data 0.824 (0.790)#011Loss 1.2000 (1.1695)#011Prec 0.469 (0.453)
Epoch: [20][2/12] lr 0.00062#011Time 1.331 (1.339)#011Data 0.779 (0.786)#011Loss 1.1785 (1.1725)#011Prec 0.438 (0.448)
Epoch: [20][3/12] lr 0.00062#011Time 1.351 (1.342)#011Data 0.799 (0.790)#011Loss 1.1785 (1.1740)#011Prec 0.398 (0.436)
Epoch: [20][4/12] lr 0.00062#011Time 1.294 (1.333)#011Data 0.742 (0.780)#011Loss 1.2680 (1.1928)#011Prec 0.375 (0.423)
Epoch: [20][5/12] lr 0.00062#011Time 1.302 (1.327)#011Data 0.751 (0.775)#011Loss 1.1601 (1.1873)#011Prec 0.477 (0.432)
Epoch: [20][6/12] lr 0.00062#011Time 1.312 (1.325)#011Data 0.761 (0.773)#011Loss 1.2288 (1.1933)#011Prec 0.477 (0.439)
Epoch: [20][7/12] lr 0.00062#011Time 1.331 (1.326)#011Data 0.780 (0.774)#011Loss 1.1630 (1.1895)#011Prec 0.477 (0.443)
Epoch: [20][8/12] lr 0.00062#011Time 1.332 (1.327)#011Data 0.779 (0.775)#011Loss 1.2240 (1.1933)#011Prec 0.508 (0.451)
Epoch: [20][9/12] lr 0.00062#011Time 1.384 (1.332)#011Data 0.831 (0.780)#011Loss 1.1887 (1.1929)#011Prec 0.484 (0.454)
Epoch: [20][10/12] lr 0.00062#011Time 1.372 (1.336)#011Data 0.818 (0.784)#011Loss 1.2263 (1.1959)#011Prec 0.477 (0.456)
Epoch: [20][11/12] lr 0.00062#011Time 0.563 (1.272)#011Data 0.315 (0.745)#011Loss 1.2178 (1.1967)#011Prec 0.434 (0.455)
validation: [0/18]#011Time 1.101 (1.101)#011Loss 1.2005 (1.2005)#011Prec 0.492 (0.492)
validation: [1/18]#011Time 1.000 (1.050)#011Loss 1.1982 (1.1994)#011Prec 0.453 (0.473)
validation: [2/18]#011Time 1.036 (1.046)#011Loss 1.2948 (1.2312)#011Prec 0.469 (0.471)
validation: [3/18]#011Time 1.007 (1.036)#011Loss 1.1672 (1.2152)#011Prec 0.484 (0.475)
validation: [4/18]#011Time 0.954 (1.019)#011Loss 1.2060 (1.2134)#011Prec 0.445 (0.469)
validation: [5/18]#011Time 1.040 (1.023)#011Loss 1.1821 (1.2081)#011Prec 0.352 (0.449)
validation: [6/18]#011Time 1.005 (1.020)#011Loss 1.1601 (1.2013)#011Prec 0.547 (0.463)
validation: [7/18]#011Time 1.042 (1.023)#011Loss 1.1739 (1.1979)#011Prec 0.445 (0.461)
validation: [8/18]#011Time 1.031 (1.024)#011Loss 1.1732 (1.1951)#011Prec 0.445 (0.459)
validation: [9/18]#011Time 1.068 (1.028)#011Loss 1.1935 (1.1950)#011Prec 0.445 (0.458)
validation: [10/18]#011Time 1.031 (1.029)#011Loss 1.2167 (1.1969)#011Prec 0.461 (0.458)
validation: [11/18]#011Time 1.013 (1.027)#011Loss 1.1549 (1.1934)#011Prec 0.422 (0.455)
validation: [12/18]#011Time 0.914 (1.019)#011Loss 1.2769 (1.1999)#011Prec 0.453 (0.455)
validation: [13/18]#011Time 1.097 (1.024)#011Loss 1.2690 (1.2048)#011Prec 0.383 (0.450)
validation: [14/18]#011Time 1.066 (1.027)#011Loss 1.2402 (1.2072)#011Prec 0.438 (0.449)
validation: [15/18]#011Time 0.991 (1.025)#011Loss 1.1949 (1.2064)#011Prec 0.484 (0.451)
validation: [16/18]#011Time 1.035 (1.025)#011Loss 1.1989 (1.2059)#011Prec 0.445 (0.451)
validation: [17/18]#011Time 0.150 (0.977)#011Loss 1.2540 (1.2064)#011Prec 0.350 (0.450)
*Validation Precision: 0.450
Epoch: 21
Epoch: [21][0/12] lr 0.00062#011Time 1.350 (1.350)#011Data 0.798 (0.798)#011Loss 1.2771 (1.2771)#011Prec 0.398 (0.398)
Epoch: [21][1/12] lr 0.00062#011Time 1.344 (1.347)#011Data 0.794 (0.796)#011Loss 1.1966 (1.2368)#011Prec 0.547 (0.473)
Epoch: [21][2/12] lr 0.00062#011Time 1.354 (1.349)#011Data 0.803 (0.798)#011Loss 1.1987 (1.2241)#011Prec 0.523 (0.490)
Epoch: [21][3/12] lr 0.00062#011Time 1.399 (1.362)#011Data 0.848 (0.811)#011Loss 1.1047 (1.1943)#011Prec 0.500 (0.492)
Epoch: [21][4/12] lr 0.00062#011Time 1.345 (1.358)#011Data 0.793 (0.807)#011Loss 1.2035 (1.1961)#011Prec 0.477 (0.489)
Epoch: [21][5/12] lr 0.00062#011Time 1.245 (1.339)#011Data 0.693 (0.788)#011Loss 1.2715 (1.2087)#011Prec 0.453 (0.483)
Epoch: [21][6/12] lr 0.00062#011Time 1.362 (1.343)#011Data 0.811 (0.791)#011Loss 1.1925 (1.2064)#011Prec 0.422 (0.474)
Epoch: [21][7/12] lr 0.00062#011Time 1.289 (1.336)#011Data 0.736 (0.785)#011Loss 1.2309 (1.2094)#011Prec 0.438 (0.470)
Epoch: [21][8/12] lr 0.00062#011Time 1.324 (1.335)#011Data 0.770 (0.783)#011Loss 1.1515 (1.2030)#011Prec 0.461 (0.469)
Epoch: [21][9/12] lr 0.00062#011Time 1.346 (1.336)#011Data 0.792 (0.784)#011Loss 1.2296 (1.2057)#011Prec 0.391 (0.461)
Epoch: [21][10/12] lr 0.00062#011Time 1.271 (1.330)#011Data 0.719 (0.778)#011Loss 1.2573 (1.2104)#011Prec 0.453 (0.460)
Epoch: [21][11/12] lr 0.00062#011Time 0.596 (1.269)#011Data 0.350 (0.742)#011Loss 1.1142 (1.2069)#011Prec 0.528 (0.463)
validation: [0/18]#011Time 0.980 (0.980)#011Loss 1.2281 (1.2281)#011Prec 0.422 (0.422)
validation: [1/18]#011Time 0.946 (0.963)#011Loss 1.2154 (1.2217)#011Prec 0.422 (0.422)
validation: [2/18]#011Time 1.069 (0.998)#011Loss 1.2359 (1.2264)#011Prec 0.430 (0.424)
validation: [3/18]#011Time 1.003 (1.000)#011Loss 1.1902 (1.2174)#011Prec 0.453 (0.432)
validation: [4/18]#011Time 0.983 (0.996)#011Loss 1.2299 (1.2199)#011Prec 0.406 (0.427)
validation: [5/18]#011Time 1.048 (1.005)#011Loss 1.1668 (1.2110)#011Prec 0.430 (0.427)
validation: [6/18]#011Time 1.048 (1.011)#011Loss 1.1536 (1.2028)#011Prec 0.523 (0.441)
validation: [7/18]#011Time 1.017 (1.012)#011Loss 1.2787 (1.2123)#011Prec 0.477 (0.445)
validation: [8/18]#011Time 1.089 (1.020)#011Loss 1.1856 (1.2093)#011Prec 0.422 (0.443)
validation: [9/18]#011Time 1.059 (1.024)#011Loss 1.1587 (1.2043)#011Prec 0.508 (0.449)
validation: [10/18]#011Time 0.968 (1.019)#011Loss 1.2696 (1.2102)#011Prec 0.398 (0.445)
validation: [11/18]#011Time 0.986 (1.016)#011Loss 1.2801 (1.2160)#011Prec 0.367 (0.438)
validation: [12/18]#011Time 1.019 (1.016)#011Loss 1.1456 (1.2106)#011Prec 0.422 (0.437)
validation: [13/18]#011Time 1.010 (1.016)#011Loss 1.2091 (1.2105)#011Prec 0.453 (0.438)
validation: [14/18]#011Time 1.047 (1.018)#011Loss 1.1367 (1.2056)#011Prec 0.484 (0.441)
validation: [15/18]#011Time 1.149 (1.026)#011Loss 1.2544 (1.2087)#011Prec 0.484 (0.444)
validation: [16/18]#011Time 1.044 (1.027)#011Loss 1.1529 (1.2054)#011Prec 0.539 (0.449)
validation: [17/18]#011Time 0.168 (0.980)#011Loss 1.0382 (1.2038)#011Prec 0.650 (0.451)
*Validation Precision: 0.451
Epoch: 22
Epoch: [22][0/12] lr 0.00062#011Time 1.377 (1.377)#011Data 0.823 (0.823)#011Loss 1.2618 (1.2618)#011Prec 0.461 (0.461)
Epoch: [22][1/12] lr 0.00062#011Time 1.294 (1.335)#011Data 0.739 (0.781)#011Loss 1.1526 (1.2072)#011Prec 0.445 (0.453)
Epoch: [22][2/12] lr 0.00062#011Time 1.375 (1.349)#011Data 0.822 (0.795)#011Loss 1.1861 (1.2002)#011Prec 0.453 (0.453)
Epoch: [22][3/12] lr 0.00062#011Time 1.297 (1.336)#011Data 0.745 (0.782)#011Loss 1.2817 (1.2206)#011Prec 0.391 (0.438)
Epoch: [22][4/12] lr 0.00062#011Time 1.363 (1.341)#011Data 0.811 (0.788)#011Loss 1.2184 (1.2201)#011Prec 0.516 (0.453)
Epoch: [22][5/12] lr 0.00062#011Time 1.340 (1.341)#011Data 0.787 (0.788)#011Loss 1.2599 (1.2268)#011Prec 0.391 (0.443)
Epoch: [22][6/12] lr 0.00062#011Time 1.329 (1.339)#011Data 0.772 (0.786)#011Loss 1.1595 (1.2171)#011Prec 0.461 (0.445)
Epoch: [22][7/12] lr 0.00062#011Time 1.322 (1.337)#011Data 0.767 (0.783)#011Loss 1.1808 (1.2126)#011Prec 0.484 (0.450)
Epoch: [22][8/12] lr 0.00062#011Time 1.346 (1.338)#011Data 0.791 (0.784)#011Loss 1.1743 (1.2083)#011Prec 0.500 (0.456)
Epoch: [22][9/12] lr 0.00062#011Time 1.299 (1.334)#011Data 0.746 (0.780)#011Loss 1.0996 (1.1975)#011Prec 0.594 (0.470)
Epoch: [22][10/12] lr 0.00062#011Time 1.292 (1.330)#011Data 0.738 (0.776)#011Loss 1.1232 (1.1907)#011Prec 0.508 (0.473)
Epoch: [22][11/12] lr 0.00062#011Time 0.571 (1.267)#011Data 0.324 (0.739)#011Loss 1.1385 (1.1888)#011Prec 0.472 (0.473)
validation: [0/18]#011Time 1.021 (1.021)#011Loss 1.2258 (1.2258)#011Prec 0.453 (0.453)
validation: [1/18]#011Time 0.981 (1.001)#011Loss 1.2191 (1.2225)#011Prec 0.414 (0.434)
validation: [2/18]#011Time 1.047 (1.017)#011Loss 1.2631 (1.2360)#011Prec 0.477 (0.448)
validation: [3/18]#011Time 0.998 (1.012)#011Loss 1.2205 (1.2321)#011Prec 0.461 (0.451)
validation: [4/18]#011Time 1.152 (1.040)#011Loss 1.0667 (1.1991)#011Prec 0.477 (0.456)
validation: [5/18]#011Time 1.067 (1.045)#011Loss 1.1662 (1.1936)#011Prec 0.422 (0.451)
validation: [6/18]#011Time 0.967 (1.033)#011Loss 1.2598 (1.2030)#011Prec 0.469 (0.453)
validation: [7/18]#011Time 0.951 (1.023)#011Loss 1.2291 (1.2063)#011Prec 0.461 (0.454)
validation: [8/18]#011Time 1.042 (1.025)#011Loss 1.1394 (1.1989)#011Prec 0.500 (0.459)
validation: [9/18]#011Time 1.105 (1.033)#011Loss 1.1638 (1.1954)#011Prec 0.492 (0.463)
validation: [10/18]#011Time 1.161 (1.045)#011Loss 1.1728 (1.1933)#011Prec 0.461 (0.462)
validation: [11/18]#011Time 0.974 (1.039)#011Loss 1.1645 (1.1909)#011Prec 0.453 (0.462)
validation: [12/18]#011Time 1.009 (1.037)#011Loss 1.2184 (1.1930)#011Prec 0.445 (0.460)
validation: [13/18]#011Time 1.090 (1.040)#011Loss 1.1774 (1.1919)#011Prec 0.453 (0.460)
validation: [14/18]#011Time 1.050 (1.041)#011Loss 1.3198 (1.2004)#011Prec 0.367 (0.454)
validation: [15/18]#011Time 1.043 (1.041)#011Loss 1.2585 (1.2041)#011Prec 0.383 (0.449)
validation: [16/18]#011Time 1.042 (1.041)#011Loss 1.2361 (1.2059)#011Prec 0.461 (0.450)
validation: [17/18]#011Time 0.168 (0.993)#011Loss 1.3088 (1.2069)#011Prec 0.350 (0.449)
*Validation Precision: 0.449
Epoch: 23
Epoch: [23][0/12] lr 0.00062#011Time 1.361 (1.361)#011Data 0.807 (0.807)#011Loss 1.2349 (1.2349)#011Prec 0.469 (0.469)
Epoch: [23][1/12] lr 0.00062#011Time 1.303 (1.332)#011Data 0.748 (0.777)#011Loss 1.1946 (1.2147)#011Prec 0.484 (0.477)
Epoch: [23][2/12] lr 0.00062#011Time 1.322 (1.329)#011Data 0.768 (0.774)#011Loss 1.1691 (1.1995)#011Prec 0.508 (0.487)
Epoch: [23][3/12] lr 0.00062#011Time 1.383 (1.342)#011Data 0.829 (0.788)#011Loss 1.2263 (1.2062)#011Prec 0.375 (0.459)
Epoch: [23][4/12] lr 0.00062#011Time 1.416 (1.357)#011Data 0.858 (0.802)#011Loss 1.2931 (1.2236)#011Prec 0.406 (0.448)
Epoch: [23][5/12] lr 0.00062#011Time 1.417 (1.367)#011Data 0.862 (0.812)#011Loss 1.3085 (1.2377)#011Prec 0.375 (0.436)
Epoch: [23][6/12] lr 0.00062#011Time 1.289 (1.356)#011Data 0.733 (0.800)#011Loss 1.1496 (1.2251)#011Prec 0.492 (0.444)
Epoch: [23][7/12] lr 0.00062#011Time 1.343 (1.354)#011Data 0.788 (0.799)#011Loss 1.1941 (1.2213)#011Prec 0.484 (0.449)
Epoch: [23][8/12] lr 0.00062#011Time 1.333 (1.352)#011Data 0.780 (0.797)#011Loss 1.2006 (1.2190)#011Prec 0.500 (0.455)
Epoch: [23][9/12] lr 0.00062#011Time 1.381 (1.355)#011Data 0.825 (0.800)#011Loss 1.1851 (1.2156)#011Prec 0.500 (0.459)
Epoch: [23][10/12] lr 0.00062#011Time 1.326 (1.352)#011Data 0.769 (0.797)#011Loss 1.1325 (1.2080)#011Prec 0.500 (0.463)
Epoch: [23][11/12] lr 0.00062#011Time 0.551 (1.285)#011Data 0.302 (0.756)#011Loss 1.2757 (1.2105)#011Prec 0.472 (0.463)
validation: [0/18]#011Time 0.991 (0.991)#011Loss 1.1991 (1.1991)#011Prec 0.383 (0.383)
validation: [1/18]#011Time 1.029 (1.010)#011Loss 1.1633 (1.1812)#011Prec 0.461 (0.422)
validation: [2/18]#011Time 1.034 (1.018)#011Loss 1.2408 (1.2010)#011Prec 0.477 (0.440)
validation: [3/18]#011Time 0.979 (1.008)#011Loss 1.2490 (1.2130)#011Prec 0.469 (0.447)
validation: [4/18]#011Time 1.029 (1.012)#011Loss 1.2518 (1.2208)#011Prec 0.438 (0.445)
validation: [5/18]#011Time 1.028 (1.015)#011Loss 1.1815 (1.2142)#011Prec 0.477 (0.451)
validation: [6/18]#011Time 0.967 (1.008)#011Loss 1.1984 (1.2120)#011Prec 0.453 (0.451)
validation: [7/18]#011Time 0.993 (1.006)#011Loss 1.2578 (1.2177)#011Prec 0.445 (0.450)
validation: [8/18]#011Time 1.048 (1.011)#011Loss 1.1824 (1.2138)#011Prec 0.484 (0.454)
validation: [9/18]#011Time 0.969 (1.007)#011Loss 1.2757 (1.2200)#011Prec 0.406 (0.449)
validation: [10/18]#011Time 1.017 (1.008)#011Loss 1.1553 (1.2141)#011Prec 0.469 (0.451)
validation: [11/18]#011Time 1.028 (1.009)#011Loss 1.1815 (1.2114)#011Prec 0.445 (0.451)
validation: [12/18]#011Time 0.985 (1.007)#011Loss 1.1748 (1.2086)#011Prec 0.477 (0.453)
validation: [13/18]#011Time 1.034 (1.009)#011Loss 1.1887 (1.2071)#011Prec 0.422 (0.450)
validation: [14/18]#011Time 1.121 (1.017)#011Loss 1.2340 (1.2089)#011Prec 0.375 (0.445)
validation: [15/18]#011Time 1.038 (1.018)#011Loss 1.2127 (1.2092)#011Prec 0.414 (0.443)
validation: [16/18]#011Time 1.067 (1.021)#011Loss 1.2259 (1.2101)#011Prec 0.453 (0.444)
validation: [17/18]#011Time 0.197 (0.975)#011Loss 1.1215 (1.2093)#011Prec 0.450 (0.444)
*Validation Precision: 0.444
Epoch: 24
Epoch: [24][0/12] lr 0.00062#011Time 1.288 (1.288)#011Data 0.733 (0.733)#011Loss 1.1509 (1.1509)#011Prec 0.508 (0.508)
Epoch: [24][1/12] lr 0.00062#011Time 1.401 (1.344)#011Data 0.847 (0.790)#011Loss 1.2122 (1.1816)#011Prec 0.539 (0.523)
Epoch: [24][2/12] lr 0.00062#011Time 1.374 (1.354)#011Data 0.820 (0.800)#011Loss 1.3114 (1.2248)#011Prec 0.375 (0.474)
Epoch: [24][3/12] lr 0.00062#011Time 1.260 (1.331)#011Data 0.706 (0.777)#011Loss 1.1978 (1.2181)#011Prec 0.398 (0.455)
Epoch: [24][4/12] lr 0.00062#011Time 1.288 (1.322)#011Data 0.735 (0.768)#011Loss 1.2133 (1.2171)#011Prec 0.453 (0.455)
Epoch: [24][5/12] lr 0.00062#011Time 1.324 (1.323)#011Data 0.771 (0.769)#011Loss 1.2027 (1.2147)#011Prec 0.477 (0.458)
Epoch: [24][6/12] lr 0.00062#011Time 1.320 (1.322)#011Data 0.765 (0.768)#011Loss 1.2693 (1.2225)#011Prec 0.430 (0.454)
Epoch: [24][7/12] lr 0.00062#011Time 1.292 (1.318)#011Data 0.737 (0.764)#011Loss 1.2545 (1.2265)#011Prec 0.438 (0.452)
Epoch: [24][8/12] lr 0.00062#011Time 1.316 (1.318)#011Data 0.762 (0.764)#011Loss 1.2659 (1.2309)#011Prec 0.453 (0.452)
Epoch: [24][9/12] lr 0.00062#011Time 1.326 (1.319)#011Data 0.771 (0.765)#011Loss 1.1579 (1.2236)#011Prec 0.500 (0.457)
Epoch: [24][10/12] lr 0.00062#011Time 1.393 (1.326)#011Data 0.837 (0.771)#011Loss 1.1815 (1.2198)#011Prec 0.477 (0.459)
Epoch: [24][11/12] lr 0.00062#011Time 0.592 (1.265)#011Data 0.344 (0.736)#011Loss 1.0179 (1.2124)#011Prec 0.585 (0.463)
validation: [0/18]#011Time 1.109 (1.109)#011Loss 1.2503 (1.2503)#011Prec 0.445 (0.445)
validation: [1/18]#011Time 1.017 (1.063)#011Loss 1.2659 (1.2581)#011Prec 0.430 (0.438)
validation: [2/18]#011Time 1.043 (1.056)#011Loss 1.2200 (1.2454)#011Prec 0.445 (0.440)
validation: [3/18]#011Time 0.977 (1.036)#011Loss 1.1812 (1.2294)#011Prec 0.422 (0.436)
validation: [4/18]#011Time 1.005 (1.030)#011Loss 1.2167 (1.2268)#011Prec 0.430 (0.434)
validation: [5/18]#011Time 1.042 (1.032)#011Loss 1.1246 (1.2098)#011Prec 0.555 (0.454)
validation: [6/18]#011Time 0.980 (1.025)#011Loss 1.1690 (1.2040)#011Prec 0.461 (0.455)
validation: [7/18]#011Time 1.040 (1.027)#011Loss 1.1651 (1.1991)#011Prec 0.469 (0.457)
validation: [8/18]#011Time 1.040 (1.028)#011Loss 1.2126 (1.2006)#011Prec 0.461 (0.457)
validation: [9/18]#011Time 1.051 (1.030)#011Loss 1.3510 (1.2157)#011Prec 0.352 (0.447)
validation: [10/18]#011Time 1.074 (1.034)#011Loss 1.2580 (1.2195)#011Prec 0.328 (0.436)
validation: [11/18]#011Time 1.106 (1.040)#011Loss 1.1468 (1.2134)#011Prec 0.508 (0.442)
validation: [12/18]#011Time 0.975 (1.035)#011Loss 1.3022 (1.2203)#011Prec 0.445 (0.442)
validation: [13/18]#011Time 1.048 (1.036)#011Loss 1.1933 (1.2183)#011Prec 0.492 (0.446)
validation: [14/18]#011Time 1.023 (1.035)#011Loss 1.1859 (1.2162)#011Prec 0.438 (0.445)
validation: [15/18]#011Time 0.991 (1.033)#011Loss 1.1582 (1.2126)#011Prec 0.484 (0.448)
validation: [16/18]#011Time 0.958 (1.028)#011Loss 1.1919 (1.2113)#011Prec 0.445 (0.448)
validation: [17/18]#011Time 0.153 (0.980)#011Loss 1.2711 (1.2119)#011Prec 0.400 (0.447)
*Validation Precision: 0.447
Testing Model
Testing: [0/12]#011Time 1.038 (1.038)#011Loss 0.9649 (0.9649)#011Prec 0.617 (0.617)
Testing: [1/12]#011Time 1.094 (1.066)#011Loss 1.0171 (0.9910)#011Prec 0.547 (0.582)
Testing: [2/12]#011Time 0.995 (1.042)#011Loss 0.9589 (0.9803)#011Prec 0.562 (0.576)
Testing: [3/12]#011Time 1.112 (1.060)#011Loss 1.0075 (0.9871)#011Prec 0.570 (0.574)
Testing: [4/12]#011Time 1.047 (1.057)#011Loss 0.9770 (0.9851)#011Prec 0.555 (0.570)
Testing: [5/12]#011Time 1.021 (1.051)#011Loss 1.1003 (1.0043)#011Prec 0.508 (0.560)
Testing: [6/12]#011Time 0.979 (1.041)#011Loss 0.9693 (0.9993)#011Prec 0.570 (0.561)
Testing: [7/12]#011Time 1.045 (1.041)#011Loss 1.1018 (1.0121)#011Prec 0.562 (0.562)
Testing: [8/12]#011Time 1.081 (1.046)#011Loss 0.9276 (1.0027)#011Prec 0.586 (0.564)

2023-04-06 00:22:45 Uploading - Uploading generated training model
Testing: [9/12]#011Time 1.075 (1.049)#011Loss 1.0473 (1.0072)#011Prec 0.586 (0.566)
Testing: [10/12]#011Time 1.166 (1.059)#011Loss 1.0182 (1.0082)#011Prec 0.547 (0.565)
Testing: [11/12]#011Time 0.412 (1.005)#011Loss 1.0192 (1.0086)#011Prec 0.566 (0.565)
*Testing Precision: 0.565
Saving Model
INFO:__main__:Hyperparameters are LR: 0.06246976097402943, Batch Size: 128
INFO:__main__:Data Paths: /opt/ml/input/data/training
INFO:__main__:Starting Model Training
INFO:__main__:Epoch: 0
INFO:__main__:Epoch: 1
INFO:__main__:Epoch: 2
INFO:__main__:Epoch: 3
INFO:__main__:Epoch: 4
INFO:__main__:Epoch: 5
INFO:__main__:Epoch: 6
INFO:__main__:Epoch: 7
INFO:__main__:Epoch: 8
INFO:__main__:Epoch: 9
INFO:__main__:Epoch: 10
INFO:__main__:Epoch: 11
INFO:__main__:Epoch: 12
INFO:__main__:Epoch: 13
INFO:__main__:Epoch: 14
INFO:__main__:Epoch: 15
INFO:__main__:Epoch: 16
INFO:__main__:Epoch: 17
INFO:__main__:Epoch: 18
INFO:__main__:Epoch: 19
INFO:__main__:Epoch: 20
INFO:__main__:Epoch: 21
INFO:__main__:Epoch: 22
INFO:__main__:Epoch: 23
INFO:__main__:Epoch: 24
INFO:__main__:Testing Model
INFO:__main__:Saving Model
2023-04-06 00:22:34,979 sagemaker-training-toolkit INFO     Reporting training SUCCESS

2023-04-06 00:23:05 Completed - Training job completed
Training seconds: 1227
Billable seconds: 1227
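The `value (average)` pairs in the log above (e.g. `Prec 0.617 (0.617)`, then `Prec 0.547 (0.582)`) follow the common "AverageMeter" pattern for tracking running averages over batches. The actual `train.py` source is not shown here, so this is only a minimal sketch of that pattern, not the script's verbatim implementation:

```python
class AverageMeter:
    """Track the latest value and the running average, printed in the
    training log as `value (average)`."""

    def __init__(self):
        self.val = 0.0   # most recent batch value
        self.sum = 0.0   # weighted sum of all values seen so far
        self.count = 0   # total number of samples

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n

    @property
    def avg(self):
        return self.sum / self.count if self.count else 0.0


# Reproduce the first two Testing lines above (batch size 128):
prec = AverageMeter()
for batch_prec in (0.617, 0.547):
    prec.update(batch_prec, n=128)
# prec.val is now 0.547 and prec.avg is 0.582, matching `Prec 0.547 (0.582)`
```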
In [7]:
# Attach the estimator to a previous training job
TrainingJobName = 'inventory-monitoring-2023-04-06-00-01-29-320'
estimator = sagemaker.estimator.Estimator.attach(TrainingJobName)
estimator.hyperparameters()
2023-04-06 00:23:25 Starting - Preparing the instances for training
2023-04-06 00:23:25 Downloading - Downloading input data
2023-04-06 00:23:25 Training - Training image download completed. Training in progress.
2023-04-06 00:23:25 Uploading - Uploading generated training model
2023-04-06 00:23:25 Completed - Training job completed
Out[7]:
{'batch_size': '128',
 'epochs': '"25"',
 'learning_rate': '"0.06246976097402943"',
 'sagemaker_container_log_level': '20',
 'sagemaker_job_name': '"inventory-monitoring-2023-04-06-00-01-29-320"',
 'sagemaker_program': '"train.py"',
 'sagemaker_region': '"us-east-1"',
 'sagemaker_submit_directory': '"s3://udacity-capstone-project-2023/inventory-monitoring-2023-04-06-00-01-29-320/source/sourcedir.tar.gz"'}
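Note the doubly quoted values in the output above (e.g. `'"25"'`): SageMaker passes every hyperparameter as a string and JSON-encodes some of them. A minimal sketch (not part of the original notebook) for recovering the underlying values from such a dict:

```python
import json

def parse_hyperparameters(raw):
    """Decode SageMaker hyperparameters: all values arrive as strings,
    and some are additionally JSON-encoded (e.g. '"25"' for "25")."""
    parsed = {}
    for key, value in raw.items():
        try:
            # '"25"' -> '25', '128' -> 128, '"us-east-1"' -> 'us-east-1'
            parsed[key] = json.loads(value)
        except (json.JSONDecodeError, TypeError):
            # Keep values that are not valid JSON as plain strings
            parsed[key] = value
    return parsed

# Subset of the Out[7] dict above:
raw = {
    "batch_size": "128",
    "epochs": '"25"',
    "learning_rate": '"0.06246976097402943"',
}
typed = parse_hyperparameters(raw)
# typed == {'batch_size': 128, 'epochs': '25',
#           'learning_rate': '0.06246976097402943'}
```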

Checking Training Performance¶

Model Profiling and Debugging¶

In [98]:
session = boto3.session.Session()
region = session.region_name

job_name = estimator.latest_training_job.name
client = estimator.sagemaker_session.sagemaker_client
description = client.describe_training_job(TrainingJobName=estimator.latest_training_job.name)
print(f"Training jobname: {job_name}")
print(f"Region: {region}")
Training jobname: inventory-monitoring-2023-04-06-00-01-29-320
Region: us-east-1
In [99]:
from smdebug.trials import create_trial
from smdebug.core.modes import ModeKeys

trial = create_trial("s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/debug-output/")
#trial = create_trial(estimator.latest_job_debugger_artifacts_path())
#trial.tensor_names()
[2023-04-07 23:12:55.454 pytorch-1-6-cpu-py36--ml-t3-medium-370ee60fbc7a856e8f67ac271515:64 INFO utils.py:27] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2023-04-07 23:12:55.489 pytorch-1-6-cpu-py36--ml-t3-medium-370ee60fbc7a856e8f67ac271515:64 INFO s3_trial.py:42] Loading trial  at path s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/debug-output/
In [41]:
len(trial.tensor("CrossEntropyLoss_output_0").steps(mode=ModeKeys.TRAIN))
Out[41]:
30
In [42]:
len(trial.tensor("CrossEntropyLoss_output_0").steps(mode=ModeKeys.EVAL))
Out[42]:
462
In [47]:
# Plot a debugging output.
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot


def get_data(trial, tname, mode):
    tensor = trial.tensor(tname)
    steps = tensor.steps(mode=mode)
    vals = []
    for s in steps:
        vals.append(tensor.value(s, mode=mode))
    return steps, vals

def plot_tensor(trial, tensor_name):

    steps_train, vals_train = get_data(trial, tensor_name, mode=ModeKeys.TRAIN)
    print("loaded TRAIN data")
    steps_eval, vals_eval = get_data(trial, tensor_name, mode=ModeKeys.EVAL)
    print("loaded EVAL data")

    fig = plt.figure(figsize=(10, 7))
    host = host_subplot(111)

    par = host.twiny()

    host.set_xlabel("Steps (TRAIN)")
    par.set_xlabel("Steps (EVAL)")
    host.set_ylabel(tensor_name)

    (p1,) = host.plot(steps_train, vals_train, label=tensor_name)
    print("completed TRAIN plot")
    (p2,) = par.plot(steps_eval, vals_eval, label="val_" + tensor_name)
    print("completed EVAL plot")
    leg = plt.legend()

    host.xaxis.get_label().set_color(p1.get_color())
    leg.texts[0].set_color(p1.get_color())

    par.xaxis.get_label().set_color(p2.get_color())
    leg.texts[1].set_color(p2.get_color())

    plt.ylabel(tensor_name)
    os.makedirs('results', exist_ok=True)  # ensure the output directory exists
    plt.savefig('results/CrossEntropy_Loss_during_training_and_validation.png')
    plt.show()
In [48]:
plot_tensor(trial, "CrossEntropyLoss_output_0")
loaded TRAIN data
loaded EVAL data
completed TRAIN plot
completed EVAL plot
In [45]:
from smdebug.profiler.analysis.notebook_utils.training_job import TrainingJob

tj = TrainingJob(job_name, region)
tj.wait_for_sys_profiling_data_to_be_available()
ProfilerConfig:{'S3OutputPath': 's3://udacity-capstone-project-2023/output-best/', 'ProfilingIntervalInMilliseconds': 500, 'ProfilingParameters': {'DataloaderProfilingConfig': '{"StartStep": 0, "NumSteps": 10, "MetricsRegex": ".*", }', 'DetailedProfilingConfig': '{"StartStep": 0, "NumSteps": 10, }', 'FileOpenFailThreshold': '50', 'HorovodProfilingConfig': '{"StartStep": 0, "NumSteps": 10, }', 'LocalPath': '/opt/ml/output/profiler', 'PythonProfilingConfig': '{"StartStep": 0, "NumSteps": 10, "ProfilerName": "cprofile", "cProfileTimer": "total_time", }', 'RotateFileCloseIntervalInSeconds': '60', 'RotateMaxFileSizeInBytes': '10485760', 'SMDataParallelProfilingConfig': '{"StartStep": 0, "NumSteps": 10, }'}}
s3 path:s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/profiler-output


Profiler data from system is available
In [46]:
from smdebug.profiler.analysis.notebook_utils.timeline_charts import TimelineCharts

system_metrics_reader = tj.get_systems_metrics_reader()
system_metrics_reader.refresh_event_file_list()

view_timeline_charts = TimelineCharts(
    system_metrics_reader,
    framework_metrics_reader=None,
    select_dimensions=["CPU", "GPU"],
    select_events=["total"],
)
[2023-04-06 00:29:16.882 pytorch-1-6-cpu-py36--ml-t3-medium-370ee60fbc7a856e8f67ac271515:34 INFO metrics_reader_base.py:134] Getting 21 event files
select events:['total']
select dimensions:['CPU', 'GPU']
filtered_events:{'total'}
filtered_dimensions:{'CPUUtilization-nodeid:algo-1', 'GPUMemoryUtilization-nodeid:algo-1', 'GPUUtilization-nodeid:algo-1'}
In [100]:
rule_output_path = estimator.output_path + estimator.latest_training_job.job_name + "/rule-output"
print(f"You will find the profiler report in {rule_output_path}")

! aws s3 ls {rule_output_path} --recursive
! aws s3 cp {rule_output_path} ./ --recursive

# Get the auto-generated folder name of the profiler report
profiler_report_name = [
    rule["RuleConfigurationName"]
    for rule in estimator.latest_training_job.rule_job_summary()
    if "Profiler" in rule["RuleConfigurationName"]
][0]


IPython.display.HTML(filename=profiler_report_name + "/profiler-output/profiler-report.html")
You will find the profiler report in s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output
2023-04-06 00:22:53     416803 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-report.html
2023-04-06 00:22:53     271471 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-report.ipynb
2023-04-06 00:22:48        192 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/BatchSize.json
2023-04-06 00:22:48      33326 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/CPUBottleneck.json
2023-04-06 00:22:48       2063 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/Dataloader.json
2023-04-06 00:22:48        332 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/GPUMemoryIncrease.json
2023-04-06 00:22:48       1186 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/IOBottleneck.json
2023-04-06 00:22:48        346 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/LoadBalancing.json
2023-04-06 00:22:48        341 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/LowGPUUtilization.json
2023-04-06 00:22:48        231 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/MaxInitializationTime.json
2023-04-06 00:22:48       2286 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/OverallFrameworkMetrics.json
2023-04-06 00:22:48        618 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/OverallSystemUsage.json
2023-04-06 00:22:48       2463 output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/StepOutlier.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/GPUMemoryIncrease.json to ProfilerReport/profiler-output/profiler-reports/GPUMemoryIncrease.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-report.ipynb to ProfilerReport/profiler-output/profiler-report.ipynb
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/CPUBottleneck.json to ProfilerReport/profiler-output/profiler-reports/CPUBottleneck.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/Dataloader.json to ProfilerReport/profiler-output/profiler-reports/Dataloader.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/BatchSize.json to ProfilerReport/profiler-output/profiler-reports/BatchSize.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/MaxInitializationTime.json to ProfilerReport/profiler-output/profiler-reports/MaxInitializationTime.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/IOBottleneck.json to ProfilerReport/profiler-output/profiler-reports/IOBottleneck.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/LoadBalancing.json to ProfilerReport/profiler-output/profiler-reports/LoadBalancing.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/OverallFrameworkMetrics.json to ProfilerReport/profiler-output/profiler-reports/OverallFrameworkMetrics.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/OverallSystemUsage.json to ProfilerReport/profiler-output/profiler-reports/OverallSystemUsage.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/LowGPUUtilization.json to ProfilerReport/profiler-output/profiler-reports/LowGPUUtilization.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-reports/StepOutlier.json to ProfilerReport/profiler-output/profiler-reports/StepOutlier.json
download: s3://udacity-capstone-project-2023/output-best/inventory-monitoring-2023-04-06-00-01-29-320/rule-output/ProfilerReport/profiler-output/profiler-report.html to ProfilerReport/profiler-output/profiler-report.html
Out[100]:
profiler-report

SageMaker Debugger Profiling Report ¶

SageMaker Debugger auto-generated this report. You can generate similar reports on all supported training jobs. The report provides a summary of the training job, system resource usage statistics, framework metrics, a rules summary, and a detailed analysis from each rule. The graphs and tables are interactive.

Legal disclaimer: This report and any recommendations are provided for informational purposes only and are not definitive. You are responsible for making your own independent assessment of the information.

In [4]:
# Parameters
processing_job_arn = "arn:aws:sagemaker:us-east-1:733710257842:processing-job/inventory-monitoring-2023--profilerreport-dc6f6a5b"

Training job summary ¶

System usage statistics ¶

Framework metrics summary ¶

Overview: CPU operators ¶

Overview: GPU operators ¶

Rules summary ¶

The following table shows a profiling summary of the Debugger built-in rules. The table is sorted by the rules that triggered the most frequently. During your training job, the GPUMemoryIncrease rule was the most frequently triggered. It processed 2322 datapoints and was triggered 154 times.

Each entry lists the rule's description, recommendation, trigger count, datapoints analyzed, and parameters:

GPUMemoryIncrease (triggered 154 times, 2322 datapoints)
    Measures the average GPU memory footprint and triggers if there is a large increase.
    Recommendation: Choose a larger instance type with more memory if the footprint is close to the maximum available memory.
    Parameters: increase:5, patience:1000, window:10

LowGPUUtilization (triggered 12 times, 2322 datapoints)
    Checks if the GPU utilization is low or fluctuating. This can happen due to bottlenecks, blocking calls for synchronizations, or a small batch size.
    Recommendation: Check if there are bottlenecks, minimize blocking calls, change the distributed training strategy, or increase the batch size.
    Parameters: threshold_p95:70, threshold_p5:10, window:500, patience:1000

Dataloader (triggered 1 time, 11 datapoints)
    Checks how many data loaders are running in parallel and whether the total number is equal to the number of available CPU cores. The rule triggers if the number is much smaller or larger than the number of available cores. If too small, it might lead to low GPU utilization. If too large, it might impact other compute-intensive operations on the CPU.
    Recommendation: Change the number of data loader processes.
    Parameters: min_threshold:70, max_threshold:200

LoadBalancing (triggered 0 times, 2322 datapoints)
    Detects workload balancing issues across GPUs. Workload imbalance can occur in training jobs with data parallelism. The gradients are accumulated on a primary GPU, and this GPU might be overused with regard to other GPUs, reducing the efficiency of data parallelization.
    Recommendation: Choose a different distributed training strategy or a different distributed training framework.
    Parameters: threshold:0.2, patience:1000

CPUBottleneck (triggered 0 times, 2330 datapoints)
    Checks if the CPU utilization is high and the GPU utilization is low. It might indicate CPU bottlenecks, where the GPUs are waiting for data to arrive from the CPUs. The rule evaluates the CPU and GPU utilization rates, and triggers the issue if the time spent on the CPU bottlenecks exceeds a threshold percent of the total training time. The default threshold is 50 percent.
    Recommendation: Consider increasing the number of data loaders or applying data pre-fetching.
    Parameters: threshold:50, cpu_threshold:90, gpu_threshold:10, patience:1000

BatchSize (triggered 0 times, 2321 datapoints)
    Checks if GPUs are underutilized because the batch size is too small. To detect this problem, the rule analyzes the average GPU memory footprint and the CPU and GPU utilization.
    Recommendation: Consider running on a smaller instance type or increasing the batch size.
    Parameters: cpu_threshold_p95:70, gpu_threshold_p95:70, gpu_memory_threshold_p95:70, patience:1000, window:500

MaxInitializationTime (triggered 0 times, 729 datapoints)
    Checks if the time spent on initialization exceeds a threshold percent of the total training time. The rule waits until the first step of the training loop starts. Initialization can take longer if the entire dataset is downloaded from Amazon S3 in File mode. The default threshold is 20 minutes.
    Recommendation: If using File mode with the TensorFlow framework, consider switching to Pipe mode.
    Parameters: threshold:20

IOBottleneck (triggered 0 times, 2330 datapoints)
    Checks if the data I/O wait time is high and the GPU utilization is low. It might indicate I/O bottlenecks where the GPU is waiting for data to arrive from storage. The rule evaluates the I/O and GPU utilization rates and triggers the issue if the time spent on the I/O bottlenecks exceeds a threshold percent of the total training time. The default threshold is 50 percent.
    Recommendation: Pre-fetch data or choose different file formats, such as binary formats that improve I/O performance.
    Parameters: threshold:50, io_threshold:50, gpu_threshold:10, patience:1000

StepOutlier (triggered 0 times, 729 datapoints)
    Detects outliers in step duration. The step duration for the forward and backward pass should be roughly the same throughout training. If there are significant outliers, it may indicate a system stall or bottleneck issues.
    Recommendation: Check if there are any bottlenecks (CPU, I/O) correlated to the step outliers.
    Parameters: threshold:3, mode:None, n_outliers:10, stddev:3

Analyzing the training loop ¶

Step duration analysis ¶

Step durations on node algo-1-27:

The following table is a summary of the statistics of step durations measured on node algo-1-27. The rule analyzed the step durations from the Step:ModeKeys.EVAL phase. The average step duration on node algo-1-27 was 0.03s. The rule detected 1 outlier, where the step duration was larger than 3 times the standard deviation of 0.33s.

mean max p99 p95 p50 min
Step Durations in [s] 0.03 6.86 0.02 0.02 0.02 0.01

The following histogram shows the step durations measured on the different nodes. You can turn on or turn off the visualization of histograms by selecting or unselecting the labels in the legend.
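The outlier criterion described above (a step flagged when its duration is far beyond the standard deviation of step times) can be sketched in plain NumPy. This is a simplification of the actual StepOutlier rule, and the `durations` data below is illustrative, not the profiler's real measurements:

```python
import numpy as np

def find_step_outliers(durations, stddev_factor=3):
    """Return indices of steps whose duration exceeds
    mean + stddev_factor * std (a simplified version of the
    StepOutlier thresholding)."""
    durations = np.asarray(durations, dtype=float)
    threshold = durations.mean() + stddev_factor * durations.std()
    return np.flatnonzero(durations > threshold)

# Illustrative data: mostly ~0.02 s steps with one 6.86 s stall,
# echoing the summary statistics above.
steps = [0.02] * 100 + [6.86]
print(find_step_outliers(steps))  # → [100]
```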

GPU utilization analysis ¶

Usage per GPU

GPU utilization of gpu0 on node algo-1:

Workload balancing

Dataloading analysis ¶

Batch size ¶

CPU bottlenecks ¶

I/O bottlenecks ¶

GPU memory ¶

Memory utilization of gpu0 on node algo-1:

Model Deployment and Querying¶

In [9]:
model_location=estimator.model_data
In [10]:
jpeg_serializer = sagemaker.serializers.IdentitySerializer("image/jpeg")
json_deserializer = sagemaker.deserializers.JSONDeserializer()


class ImagePredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(ImagePredictor, self).__init__(
            endpoint_name,
            sagemaker_session=sagemaker_session,
            serializer=jpeg_serializer,
            deserializer=json_deserializer,
        )
In [12]:
role = sagemaker.get_execution_role()

pytorch_model = PyTorchModel(model_data=model_location, role=role, entry_point='code/inference.py',py_version='py36',
                             framework_version='1.8',
                             predictor_cls=ImagePredictor)
In [13]:
predictor = pytorch_model.deploy(initial_instance_count=1, instance_type='ml.m5.large')
------!
In [14]:
from PIL import Image
import io

def image_to_byte_array(image: Image.Image) -> bytes:
    imgByteArr = io.BytesIO()
    image.save(imgByteArr, format=image.format)
    imgByteArr = imgByteArr.getvalue()
    return imgByteArr

img = Image.open("dataset/test/4/00010.jpg", mode='r')
img_bytes = image_to_byte_array(img)
Image.open(io.BytesIO(img_bytes))
Out[14]:
In [15]:
response=predictor.predict(img_bytes, initial_args={"ContentType": "image/jpeg"})
In [16]:
response
Out[16]:
[[-7.966815948486328,
  -4.312801361083984,
  2.588212490081787,
  4.92999267578125,
  6.17242956161499,
  -1.381866216659546]]
In [17]:
index = np.argmax(response, 1)[0]
In [18]:
print(index)
4
In [19]:
img = Image.open("dataset/test/5/00132.jpg", mode='r')
img_bytes = image_to_byte_array(img)
Image.open(io.BytesIO(img_bytes))
response2=predictor.predict(img_bytes, initial_args={"ContentType": "image/jpeg"})
In [20]:
response2
Out[20]:
[[2.5474486351013184,
  1.9740455150604248,
  0.8217211961746216,
  0.3288322389125824,
  -0.349445641040802,
  -5.314144611358643]]
In [21]:
index = np.argmax(response2, 1)[0]
print(index)
0
In [77]:
test_folder = 'dataset/test'
 
Categories =  os.listdir(test_folder)
 
test_images = pd.DataFrame()
for category in Categories:
    allFiles = os.listdir(os.path.join(test_folder, category))
    for file in allFiles:
        if file.endswith('.jpg'):
            test_images = test_images.append({'image_name': os.path.join(test_folder, category, file),
                                              'category': int(category)},
                                             ignore_index=True)
test_images = test_images.sample(frac=1)
In [78]:
test_images.describe()
Out[78]:
category
count 1461.000000
mean 3.130048
std 1.272817
min 1.000000
25% 2.000000
50% 3.000000
75% 4.000000
max 5.000000
In [97]:
category_plot = test_images['category'].value_counts().plot.bar()
plt.title('Test image distribution across target classes')
plt.xlabel('Target bin count')
plt.ylabel('Number of Images')
plt.savefig('results/test_images_distribution_between_target_features.png')
In [57]:
def predict_image(image_path, label):
    # label is accepted for bookkeeping but unused; the prediction depends only on the image
    img = Image.open(image_path, mode='r')
    img_bytes = image_to_byte_array(img)
    response = predictor.predict(img_bytes, initial_args={"ContentType": "image/jpeg"})
    return int(np.argmax(response, 1)[0])
In [26]:
test_results = pd.DataFrame()

for index, row in test_images.iterrows():
    result = predict_image(row[1], row[0])
    test_results = test_results.append({'image_name': row[1],
                                        'category'  : int(row[0]),
                                        'prediction': result },
                                        ignore_index = True)
In [27]:
test_results.head()
Out[27]:
category image_name prediction
0 3.0 dataset/test/3/01010.jpg 2.0
1 2.0 dataset/test/2/10461.jpg 1.0
2 1.0 dataset/test/1/104130.jpg 1.0
3 4.0 dataset/test/4/05239.jpg 2.0
4 2.0 dataset/test/2/102842.jpg 1.0
In [96]:
prediction_plot = test_results['prediction'].value_counts().plot.bar()
plt.title('Prediction distribution')
plt.xlabel('Predicted bin count')
plt.ylabel('Number of Images')
plt.savefig('results/prediction_distribution.png')
In [81]:
category = test_images['category'].value_counts().sort_index()
category = category.reindex([0,1,2,3,4,5], fill_value=0)
print(category)
prediction = test_results['prediction'].value_counts().sort_index()
prediction = prediction.reindex([0,1,2,3,4,5], fill_value=0)
print(prediction)
0      0
1    172
2    322
3    373
4    332
5    262
Name: category, dtype: int64
0    134
1    357
2    459
3    262
4    249
5      0
Name: prediction, dtype: int64
In [84]:
import numpy as np
import matplotlib.pyplot as plt

r= np.arange(len(category))
r2= np.arange(len(prediction))

width = 0.25

plt.bar(r, category, color='b',
        width=width, edgecolor='black',
        label='category')

plt.bar(r2 + width, prediction, color='g',
        width=width, edgecolor='black',
        label='prediction')
plt.xlabel("Number Of items")
plt.ylabel("Number of predictions")
plt.title("Comparing Ground Truth With Prediction")
  
plt.legend()
plt.savefig('results/Comparing_Ground_Truth_With_Prediction.png')
plt.show()

Calculating Accuracy¶

In [91]:
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

accuracy_score(test_results['category'], test_results['prediction'])
Out[91]:
0.20191649555099248
In [94]:
print(classification_report(test_results['category'], test_results['prediction'], target_names=['0','1','2','3','4','5'], digits=4))
              precision    recall  f1-score   support

           0     0.0000    0.0000    0.0000         0
           1     0.1345    0.2791    0.1815       172
           2     0.2157    0.3075    0.2535       322
           3     0.2595    0.1823    0.2142       373
           4     0.3213    0.2410    0.2754       332
           5     0.0000    0.0000    0.0000       262

    accuracy                         0.2019      1461
   macro avg     0.1552    0.1683    0.1541      1461
weighted avg     0.2026    0.2019    0.1945      1461

/opt/conda/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/opt/conda/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/opt/conda/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/opt/conda/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/opt/conda/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/opt/conda/lib/python3.6/site-packages/sklearn/metrics/_classification.py:1245: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
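The report above shows that errors are systematically shifted toward lower counts: class 5 is never predicted, while classes 0–2 are over-predicted. A confusion matrix makes that shift explicit. The sketch below builds one in plain NumPy (so it works without sklearn); the label arrays are illustrative, not the actual test results:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=6):
    """Rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[int(t), int(p)] += 1
    return cm

# Illustrative labels only; on the real data you would pass
# test_results['category'] and test_results['prediction'].
y_true = [3, 2, 1, 4, 2]
y_pred = [2, 1, 1, 2, 1]
print(confusion_matrix(y_true, y_pred))
```

On the real predictions, mass below the diagonal would confirm the model's tendency to undercount the items in a bin.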
In [95]:
# Once all work is done, shut down and delete the endpoint to avoid ongoing charges
predictor.delete_endpoint()

Multi-Instance Training¶

In [ ]:
# Adjust this cell to perform multi-instance training
estimator = PyTorch(
    entry_point='code/train.py',
    base_job_name='inventory-monitoring',
    role=role,
    instance_count=5,
    instance_type='ml.g4dn.xlarge',
    framework_version='1.8',
    py_version='py36',
    hyperparameters=hyperparameters,
    output_path = "s3://udacity-capstone-project-2023/output-best/",
    ## Debugger and Profiler parameters
    rules = rules,
    debugger_hook_config=hook_config,
    profiler_config=profiler_config,
)
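With instance_count=5, SageMaker launches the same training script on every instance and describes the cluster to each copy through environment variables such as SM_HOSTS (a JSON list of hostnames) and SM_CURRENT_HOST. A minimal sketch of how code/train.py could derive its rank and world size from them (the helper name is ours, not part of the SageMaker SDK):

```python
import json
import os

def cluster_info(env=os.environ):
    """Derive (rank, world_size) from SageMaker's SM_HOSTS /
    SM_CURRENT_HOST environment variables, defaulting to a
    single-instance job when they are absent."""
    hosts = json.loads(env.get("SM_HOSTS", '["algo-1"]'))
    current = env.get("SM_CURRENT_HOST", "algo-1")
    return hosts.index(current), len(hosts)

# Example: the second of five instances.
env = {"SM_HOSTS": '["algo-1","algo-2","algo-3","algo-4","algo-5"]',
       "SM_CURRENT_HOST": "algo-2"}
print(cluster_info(env))  # → (1, 5)
```

Note that raising instance_count by itself only replicates the job; for true data-parallel training the script still has to coordinate the instances (e.g. initialize torch.distributed using this rank and world size).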
In [ ]:
os.environ['SM_CHANNEL_TRAINING']='s3://udacity-capstone-project-2023/'
os.environ['SM_MODEL_DIR']='s3://udacity-capstone-project-2023/models/'
os.environ['SM_OUTPUT_DATA_DIR']='s3://udacity-capstone-project-2023/output/'

estimator.fit({"training": "s3://udacity-capstone-project-2023/"})